Programmable function keys for a networked personal imaging computer system.



A personal imaging computer system (PICS) includes a
plurality of programmable function keys
which can be programmed so as to cause the PICS 
equipment to perform at least one of plural selec- 
table image-processing tasks. The PICS equipment 
is connected to a computerized local area network, 
and the programmable function keys can be pro- 
grammed by LAN users from their workstations. 
Preferably, the programmable function keys are di- 
vided into two groups, one of the groups being 
restricted to programming by the network admin- 
istrator and the other group having unrestricted pro- 
gramming. The PICS equipment includes a display 
which displays an image of the plural function keys, 
and in response to operator selection of an image of 
one of those plural function keys, the function per- 
formed by that key is displayed. When the physical 
function key itself is manipulated, the PICS equip- 
ment executes the programmed imaging processing 
tasks.

BACKGROUND OF THE INVENTION
Field Of The Invention 

The present invention relates to an optical 
character recognition system, and more particularly 
to methods and apparatuses for scanning and stor- 
ing images of documents in a computer, for seg- 
menting images of the document into text and non- 
text blocks, and for determining the identity of 
characters in the text blocks. 

Description Of The Related Art 

In recent years, it has become possible to scan 
in paper copies of documents so as to form com- 
puterized images of such documents, and to ana- 
lyze images in text areas of the document so as to 
recognize individual characters in the text data and 
to form a computer readable file of character codes 
corresponding to the recognized characters. Such 
files can then be manipulated in word-processing, 
data-compression, or other information processing 
programs, and can also be used to retrieve the 
images of the documents in response to a query- 
based search of the text data. Such systems, which 
are hereinafter referred to as "character recognition 
systems", are advantageous because they elimi- 
nate the need to re-type or otherwise re-enter text 
data from the paper copies of the documents. For 
example, it is possible to recognition-process a 
document which has been transmitted by facsimile 
or reproduced from microfilm or by a photocopier 
so as to form computer text files that contain 
character codes (for example, ASCII character
codes) of the characters and the numerals in the 
document. 

Conventional character recognition systems 
scan the paper copy of the document to form a 
binary image of the document. "Binary image" 
means that each pixel in the image is either a 
binary zero, representing a white area of the document,
or a binary one, representing a black area.
The binary image (or "black-and-white image") is 
thereafter subjected to recognition processing so 
as to determine the identity of characters in text 
areas of the document. 

It has recently been discovered that recognition 
accuracy can be improved dramatically if the paper 
document is scanned to form a gray-scale image of
the document. "Gray-scale" means that each pixel 
of the document is not represented by either a 
binary one or a binary zero, but rather is repre- 
sented by any one of more than two intensity 
levels, such as any one of four intensity levels or 
16 intensity levels or 256 intensity levels. Such a 
system is described in commonly-assigned application
Serial No. 08/112,133 filed August 26, 1993,
"OCR Classification Based On Transition Ground
Data", the contents of which are incorporated herein
by reference as if set forth in full. In some cases,
using gray-scale images of documents rather than
binary images improves recognition accuracy from
one error per document page to less than one error
per 500 document pages.

Figure 1 illustrates the difference between binary
images and gray-scale images, and assists in
understanding how the improvement in recognition
accuracy, mentioned above, is obtained. Figure 1(a)
illustrates a character "a" over which is superimposed
a grid 1 representing the pixel resolution
with which the character "a" is scanned by a
photosensitive device such as a CCD array. For
example, grid 1 may represent a 400 dot-per-inch
(dpi) resolution. A binary image of character "a" is
formed, as shown in Figure 1(b), by assigning to
each pixel a binary one or a binary zero in dependence
on whether the character "a" darkens the
photosensitive device for the pixel sufficiently to
activate that pixel. Thus, pixel 2a in Figure 1(a) is
completely within a black portion of character "a"
and results in black pixel 2b in Figure 1(b). On the
other hand, pixel 3a is completely uncovered and
results in white pixel 3b. Pixel 4a is partially covered
but insufficiently covered to activate that pixel
and therefore results in white pixel 4b. On the other
hand, pixel 5a is covered sufficiently so as to
activate it and results in black pixel 5b.

Figure 1(c) shows a gray-scale image of the
same character "a". As shown in Figure 1(c), pixels
which are completely covered (2a) or uncovered
(3a) result in completely black or white gray-scale
levels, the same as in Figure 1(b). On the other
hand, pixels which are partially covered are assigned
a gray level representing the amount of
coverage. Thus, in Figure 1(c) which shows a four-level
gray-scale image, pixel 4c receives a low
gray-scale value and pixel 5c receives a higher
gray-scale value due to the relative coverage of
pixels 4a and 5a, respectively. Thus, because of an
artifact of the scanning process, an original black
and white document, as shown in Figure 1(a), can
be scanned into a gray-scale image as shown in
Figure 1(c) with gray-scale values being assigned
primarily at character edges and being dependent
on coverage of the pixels.
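
By way of non-limiting illustration, the difference between the binary assignment of Figure 1(b) and the four-level gray-scale assignment of Figure 1(c) may be sketched as follows. The coverage fractions, the 0.5 activation level, the function names, and the convention that a larger value means a darker pixel are all assumptions made for this sketch; none of them is taken from the figures.

```python
def binarize(coverage, activation=0.5):
    """Return 1 (black) when coverage reaches the activation level, else 0 (white)."""
    return 1 if coverage >= activation else 0


def gray_level(coverage, levels=4):
    """Quantize a fractional coverage in [0, 1] to one of `levels` intensity levels."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must lie in [0, 1]")
    # Full coverage maps to the darkest level (levels - 1), no coverage to 0.
    return min(levels - 1, int(coverage * levels))


# Hypothetical coverage fractions standing in for pixels 2a, 3a, 4a and 5a.
coverage = {"2a": 1.0, "3a": 0.0, "4a": 0.3, "5a": 0.7}

binary = {name: binarize(c) for name, c in coverage.items()}
gray = {name: gray_level(c) for name, c in coverage.items()}
```

Under these assumed values the partially covered pixels 4a and 5a collapse to white and black in the binary image but keep distinct intermediate levels in the gray-scale image, which is the additional edge detail discussed in connection with Figure 1(c).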

A comparison of Figures 1(b) and 1(c) shows
that there are additional details in Figure 1(c), especially
at character edges. This additional detail is
primarily responsible for improved recognition accuracy.

A problem still remains, however, in how to
extract individual gray-scale images of characters
from a gray-scale image of a document so as to
send the individual gray-scale character image for
recognition processing. More particularly, recognition
accuracy depends greatly on the ability to
determine where one character begins and another
ends so that only a single character, rather than a
group of characters, is subjected to recognition
processing.

Figure 2 illustrates this situation and shows a 
page of a representative document. In Figure 2, a 
document 10 is arranged in two-column format. 
The document includes title blocks 12 which in- 
clude text information of large font size suitable for 
titles, a picture block 13 which includes a color or 
halftone picture, text blocks 14 which include lines 
of individual characters of text information, a graphic
block 15 which includes graphic images which
are non-text, a table block 16 which includes tables 
of text or numerical information surrounded by non- 
text borders or frames, and caption blocks 17 
which include text information of small font size 
suitable for captions and which are normally asso- 
ciated with blocks of graphic or tabular information. 

When document 10 is scanned to form a gray- 
scale image of the document, prior to recognition 
processing, it is necessary to determine which 
areas of the gray-scale image are text areas and 
which are non-text areas, and also to determine, for 
the text areas, where individual characters are lo- 
cated. This processing is hereinafter referred to as 
"segmentation processing". Only after segmenta- 
tion processing has located individual characters 
can the images of those characters be subjected to 
recognition processing so as to identify the char- 
acters and to form a text file of the characters. 

Conventional segmentation processing tech- 
niques for binary images are generally unsatisfac- 
tory in that they do not accurately separate text 
from non-text areas and they do not accurately 
identify the location of individual characters in the 
text areas. Moreover, for gray-scale images, no 
segmentation processing techniques are currently 
known. 

Furthermore, the variety of available image processing
techniques is increasing and it is becoming
increasingly typical for a user to chain different
image processing tasks together so as to achieve a
desired output. For example, to build a searchable
database of image files, a user may need to scan
in some documents to create some image files,
retrieve some other image files from an existing
database, subject all image files to recognition-processing
to create text files, and store the text
files in association with the image files. However,
as the variety of image processing tasks increases,
it is rapidly becoming more difficult for the user to
chain those tasks together. This is especially true
when the user is operating on a computerized local
area network and the image processing center is
physically remote from where the user is located,
since the user must remember the image processing
tasks as he walks from his location to that of
the image processing center.

SUMMARY OF THE INVENTION 

The preferred embodiment of the present invention
provides an improved gray-scale character
recognition system that is operable on a computerized
local area network and which includes a plurality
of programmable function keys which allow users
to specify in advance which image processing
tasks are needed.

According to an embodiment of the invention, a
personal imaging computer system includes imaging
computer means connectable to a local area
network (LAN), the imaging computer means for
performing selectable image-processing tasks on
document images. The system includes plural programmable
function keys which may be programmed
by network users from their individual
workstations and which can be selected by the
network user when they physically arrive at the
system to perform image processing tasks. If desired,
the plural programmable function keys may
be divided into two groups, one of the groups
being programmable only by a network administrator,
and the other of the groups being programmable
by any LAN user. Display means are provided
for displaying an image of the plural function
keys. In response to operator selection of an image
of one of the plural function keys, the display
means displays the function performed by that key,
and in response to operator manipulation of the
physical function key, the personal imaging computer
system performs the image processing task
programmed by that key.

In an alternative aspect of the invention, individual
characters in a gray-scale image of a document
are extracted for recognition processing by
thresholding the gray-scale image to obtain a binary
image, segmentation-processing the binary
image to locate individual characters within the
binary image and to determine the shape of the
individual characters, and using the location and
shape of the binary image to extract the gray-scale
image of each individual character from the gray-scale
image. The extracted gray-scale image of
each character is then subjected to recognition
processing.

Thus, a character recognition system in one
embodiment of the invention identifies characters in
a document on which the characters are formed by
scanning the document to obtain a gray-scale image
of the document, and by generating a binary
image from the gray-scale image, by comparing the
gray-scale image with a threshold. The binary image
is segmented to locate individual characters
within the binary image and to determine the shape
of the individual characters. Based on the location
and the shape of the character in the binary image,
gray-scale image information is extracted from the
gray-scale image for each such individual character.
The extracted gray-scale image is then recognition-processed
to determine the identity of the
character, and the identity of the character is
stored in a computer-readable file.
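
The thresholding, segmentation and extraction steps summarized above may be sketched as follows, purely as an illustration under stated assumptions. The sketch assumes that a larger gray value means a darker pixel, locates characters as 8-connected groups of black pixels, and uses only the bounding box of each group as its "location and shape"; the function names and the sample page are hypothetical, and the rule-based processing described in connection with later figures is not reproduced here.

```python
from collections import deque


def threshold(gray, t):
    """Binary image: 1 where the gray value reaches t (dark), else 0."""
    return [[1 if v >= t else 0 for v in row] for row in gray]


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 8-connected black regions."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def extract_gray_characters(gray, t):
    """Locate characters in the binary image, then crop them from the gray image."""
    binary = threshold(gray, t)
    boxes = sorted(connected_components(binary), key=lambda b: b[1])  # left to right
    return [[row[left:right + 1] for row in gray[top:bottom + 1]]
            for top, left, bottom, right in boxes]


# Hypothetical 4 x 6 gray-scale page holding two small "characters".
page = [
    [0,   0,   0, 0,   0, 0],
    [0, 200, 140, 0, 180, 0],
    [0, 255, 100, 0, 255, 0],
    [0,   0,   0, 0,   0, 0],
]
chars = extract_gray_characters(page, t=128)
```

In this sketch the first cropped character retains the value 100, which falls below the threshold and is therefore white in the binary image; the crop is taken from the gray-scale image precisely so that such edge values remain available to recognition processing.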

Improved recognition accuracy can also be ob- 
tained by not only recognition-processing the gray- 
scale image of the character, as described above, 
but by additionally recognition-processing the bi- 
nary image of the character. Any inconsistencies 
between identities determined from the gray-scale 
image and the binary image are resolved (or "disambiguated")
based on physical image attributes,
such as aspect ratio and pixel density, of the 
binary image of the character. 
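
One possible way of resolving such inconsistencies is sketched below. The text identifies only the attributes consulted (aspect ratio and pixel density of the binary image); the nearest-match decision rule, the table of expected attribute values, and all names in the sketch are assumptions introduced solely for illustration.

```python
def aspect_ratio(binary):
    """Height divided by width of the binary character image."""
    return len(binary) / len(binary[0])


def pixel_density(binary):
    """Fraction of pixels in the binary character image that are black (1)."""
    total = len(binary) * len(binary[0])
    return sum(map(sum, binary)) / total


def disambiguate(gray_id, binary_id, binary, expected):
    """Choose between two candidate identities using binary-image attributes.

    `expected` maps a character to a hypothetical (aspect ratio, density) pair;
    the candidate whose pair lies closest to the measured attributes is kept.
    """
    if gray_id == binary_id:
        return gray_id  # nothing to resolve
    measured = (aspect_ratio(binary), pixel_density(binary))

    def distance(candidate):
        ratio, density = expected[candidate]
        return abs(ratio - measured[0]) + abs(density - measured[1])

    return min((gray_id, binary_id), key=distance)


# Hypothetical expected attributes for two easily confused results.
EXPECTED = {"l": (4.0, 0.9), "o": (1.0, 0.6)}

tall_stroke = [[1], [1], [1], [1]]  # 4 pixels high, 1 pixel wide, fully black
result = disambiguate("o", "l", tall_stroke, EXPECTED)
```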

Additional recognition accuracy can be ob- 
tained by determining font characteristics of the 
characters, for example, by determining whether 
the characters are uniformly spaced, proportionally 
spaced, sans-serif, or the like. Based on the font 
characteristics, one of plural recognition processing 
techniques is selected, and as each character is 
extracted, as described above, the extracted character
is subjected to recognition processing in accordance
with the selected recognition processing
technique.

Once text within a document has been iden- 
tified and stored as a computer readable file, that 
text file may be used to retrieve the document 
image, for example, using a query-based search so 
as to retrieve a corresponding document image. 

Because recognition processing techniques re- 
quire a resolution for the document image that is 
much higher than required for normal human visual 
acuity, one aspect of the invention is directed to a 
document storage and retrieval system which, 
when compared to conventional systems, reduces 
the amount of storage that is needed. According to 
this aspect of the invention, a document storage 
and retrieval system involves scanning a document 
at a first resolution to form a gray-scale image of 
the document, the first resolution being suitable for 
recognition-processing text in the document. Text 
in the document is recognition-processed so as to 
obtain a computer-readable file of the text, and the 
resolution of the gray-scale image is then reduced 
to a second resolution lower than the first resolution,
the second resolution being suitable for visual
perception and reproduction of the image. Only the
reduced resolution image is stored, and it is stored
in association with the computer readable file so
that the image may later be retrieved using a
query-based search.

By virtue of this arrangement, since a lowered
resolution image is stored, memory storage requirements
are reduced and more images can be
stored. Moreover, processing speed is increased
since the amount of image data is smaller and it
can be moved, compressed and decompressed,
and otherwise processed, more quickly.
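
The storage arrangement just described may be sketched as follows, by way of example only. The block-averaging method of reducing resolution, the integer reduction factor, the substring query, and the class and function names are all assumptions made for this sketch; the text does not specify how the resolution is reduced or how the query is evaluated.

```python
def downsample(gray, factor):
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(gray) // factor, len(gray[0]) // factor
    return [[sum(gray[r * factor + dy][c * factor + dx]
                 for dy in range(factor)
                 for dx in range(factor)) // (factor * factor)
             for c in range(cols)]
            for r in range(rows)]


class DocumentStore:
    """Keeps only the reduced-resolution image, in association with its text."""

    def __init__(self):
        self._records = []

    def add(self, gray, text, factor=2):
        # The full-resolution image is discarded once it has been reduced;
        # `text` stands in for the output of recognition processing.
        self._records.append((text, downsample(gray, factor)))

    def search(self, query):
        """Return the stored images whose associated text contains the query."""
        q = query.lower()
        return [image for text, image in self._records if q in text.lower()]


# Hypothetical 4 x 4 scan and hypothetical recognized text.
scan = [
    [0,  40, 255, 255],
    [0,  40, 255, 255],
    [10, 10,   0,   0],
    [10, 10,   0,   0],
]
store = DocumentStore()
store.add(scan, "Quarterly report")
hits = store.search("report")
```

With a reduction factor of 2 in each direction, the stored image holds one quarter of the pixels of the scanned image, which illustrates the reduction in memory storage requirements noted above.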

This brief summary has been provided so that
the nature of the invention may be understood
quickly. A more complete understanding of the
invention can be obtained by reference to the following
detailed description of the preferred embodiment
thereof in connection with the attached
drawings.

The various aspects of this invention may be
used separately or in combination, and the following
description is by way of non-limiting example.



BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1(a), 1(b) and 1(c) are views for explaining
differences between binary images and
gray-scale images.

Figure 2 is an illustration of a representative
document page.

Figure 3 is a partially cut-away view of the
outward appearance of a personal imaging computer
system according to the invention.

Figure 4 is a diagram explaining network con- 
nection of the Figure 3 apparatus. 

Figure 5 is a detailed block diagram of the
internal construction of the Figure 3 apparatus.

Figures 6 and 7 are close-up views of the 
control panel of the Figure 3 apparatus. 

Figure 8 is a flow diagram for explaining document
storage and retrieval.

Figures 9-1, 9-2 and 9-3 are flow diagrams for
explaining optical character recognition according
to the invention.

Figures 10(a) and 10(b) are flow diagrams for
explaining how to de-skew images.

Figures 11(a), 11(b) and 11(c) are representative
views of skewed and de-skewed pixel images.

Figure 12 is a flow diagram for explaining how 
to form binary images from gray-scale images by 
thresholding. 

Figures 13(a) and 13(b) are representative
histograms of a gray-scale image.

Figure 14 is a flow diagram for explaining segmentation
processing according to the invention.

Figure 15 is a view for explaining derivation of
connected components within an image.

Figure 16 shows image attributes stored for 
each connected component. 

Figure 17 is an example of how image processing
affects the image of the word "finally".

Figure 18, which includes Figures 18(a) and
18(b), is a flow diagram for explaining underline
removal.

Figures 19(a) through 19(e) are views showing
sequential stages of underline removal and how
those stages affect an image of underlined characters.

Figure 20 is a flow diagram for explaining con- 
nected component analysis. 

Figure 21 is a view showing how connected 
components are derived for an image of the word 
"UNION". 

Figure 22, which includes Figures 22(a)
through 22(f), is a flow diagram showing rule-based
processing of connected components.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

The structure of one representative embodiment
of the invention is shown in Figures 3, 4 and
5, and the operation of the representative embodiment
is explained in the remaining figures. The
embodiment described here is a "personal imaging 
computer system", that is, a single stand-alone 
device that contains document scanning, storage 
and processing equipment which is connectable to 
a computerized local area network or wide area 
network. Equivalent general purpose components 
may be substituted for the equipment described 
herein. It is possible, for example, to substitute a 
general purpose programmable computer with suit- 
able peripheral equipment. 

[1.1 Personal Imaging Computer System] 

Figure 3 is a partially cut away perspective 
view of the outward appearance of a personal im- 
aging computer system ("PICS") that incorporates 
a gray-scale character recognition system according
to the invention. As shown in Figure 3, PICS
equipment 20 includes, in one housing, a document
feed section 21 upon which a stack of paper
documents may be placed and from which the
paper documents are fed one sheet at a time 
through a document scanner section 22. The docu- 
ment scanner section 22, which preferably includes 
a dual-side scanner, scans each document page 
using a CCD line-array to create a gray-scale im- 
age of the document. After scanning, the document 
pages are ejected to eject tray 23 upon which they 
are stacked. Blank document sheets in paper stor- 
age tray 25 (or in an unshown paper cartridge) are 
also fed by PICS equipment 20 past printer section 
26 which forms toner images on the blank sheets 
and feeds the newly-printed documents to eject 
tray 27. 

PICS equipment 20 further includes a facsimile/modem
interface (shown in Figure 5) by
which PICS equipment 20 interfaces to an ordinary
voice/data telephone line so as to engage in data
and facsimile communication with remote computers
and so as to permit an operator to engage in
ordinary voice communication via telephone handset
30. Interfaces are also provided to a local area
network 31 and a wide area network 32 so as to
allow communication with users on remote workstations
via such networks.

Operator controls and displays are provided at
control panel 34. Control panel 34 includes a flat
panel display screen 35 such as a VGA liquid
crystal display panel. A trackball 36 is provided so
as to permit an operator to manipulate a cursor
displayed on display screen 35 and so as to permit
an operator to select objects on the display screen.
An ordinary telephone keypad is provided at 33,
conventional facsimile control buttons are provided
at 37, and start/stop buttons are provided at 38.
Programmable function keys are provided at 39 so
as to permit an operator to control various image
processing operations of PICS equipment 20.

PICS equipment 20 includes a general purpose
computer (described in further detail in Figure 5)
whereby an operator is able to scan in documents,
segmentation-process and recognition-process the
documents to create text files corresponding to text
areas in the documents, print out document images,
manipulate document images and text files
via trackball 36 and display screen 35, and send
and receive documents and images via facsimile.
Other information processing techniques, such as
word-processing, image-processing and spreadsheet-processing
may be carried out by the operator
in accordance with software loaded into PICS
equipment 20, thereby providing the operator with
a powerful personal imaging computer system together
with a general purpose computer system for
other information processing projects.

[1.2 Computerized Network Connection] 

When connected to a local area network 31
and/or a wide area network 32, PICS equipment 20
provides the capabilities described above to computerized
network users. More particularly, as
shown in Figure 4, PICS equipment 20 may be
connected to a local area network 31. Plural workstations,
such as workstations 40, are also connected
to local area network 31, and under control
of the network operating system, the workstations
40 are able to access the imaging capabilities of
PICS equipment 20. One of the workstations, such
as workstation 43, may be designated for use by a
network administrator. Also connected to local area
network 31 is a file server 41 which manages
access to files stored on network disk 42. A print
server 44 provides print services to printers 45.
Other unshown peripherals may be connected to
local area network 31. By virtue of this arrangement,
operators at one of workstations 40 can
scan in a document, segmentation-process and
recognition-process the document image so as to
obtain a text file corresponding to text areas of the
document, store the document image and associated
text file, retrieve the document image and its
associated text file for manipulation at
workstation 40, and print out the original or manipulated
document image and text file on one of
printers 45.

Typically, a local area network such as that
illustrated at 31 services a localized group of
users such as a group of users on one floor or
contiguous floors in a building. As users become
more remote from one another, for example, in different
buildings or different states, a wide area
network may be created which is essentially a
collection of several local area networks all connected
by high speed digital lines such as high
speed ISDN telephone lines. Thus, as shown in
Figure 4, local area networks 31, 46 and 48 are
connected to form a wide area network via
modem/transponder 49 and backbone 50. Each
local area network includes its own workstations,
and each ordinarily includes its own file server and
print server although that is not necessarily the
case. Thus, as shown in Figure 4, local area network
46 includes workstations 51, a file server,
network disk 54, a print server, and printers 56.
Local area network 48, on the other hand, includes
only workstations 57. Via wide area network connections,
equipment in any of local area networks 31,
46 or 48 can access the capabilities of equipment
in any of the other local area networks. Thus,
for example, it is possible for one of workstations
57 to access the imaging capabilities of PICS
equipment 20 via backbone 50 and modem/transponder
49; likewise, it is possible for one of workstations
51 to retrieve a document image from
network disk 42, subject it to segmentation and
recognition processing on PICS equipment 20, receive
and manipulate the results at workstation 51,
and print out documentation on one of printers 56.
Other combinations are, of course, possible and
the above examples should not be considered
limiting.

[1.3 Internal Construction]



Figure 5 is a detailed block diagram showing
the internal construction and connections of the
presently preferred embodiment of PICS equipment
20 in accordance with the invention. As
shown in Figure 5, PICS equipment 20 includes
central processing unit ("CPU") 60 such as an
80486DX or reduced instruction set computer
("RISC") interfaced to computer bus 61. Also interfaced
to computer bus 61 is Ethernet interface 62
for interfacing to local area network 31, an interface
64 for interfacing to wide area network 32, a
modem/facsimile/voice telephone interface for
providing appropriate modem/facsimile/voice telephone
interface to telephone line 29, a printer interface
for interfacing to printer 26, and a tray/paper
feed interface 67 for providing appropriate paper
feeding commands for feeding out of document
tray 21 past scanner 22 and to eject tray 23, and
for feeding out of paper supply 25 past printer 26
and to eject tray 27.

A display interface 69 interfaces between display
35 and computer bus 61, and a trackball/keyboard
interface 70 provides interface between
computer bus 61 and trackball 36 and keys 39.

Computer bus 61 interfaces to scanner 22 via
scanner interface 71 and JPEG processor 72.
More particularly, as scanner 22 scans a document
and the pixel data is collected by scanner interface
71, scanner interface 71 sends the pixel data to
JPEG processor 72 so that the pixel data is compressed
using JPEG compression. The compressed
pixel data is provided to computer bus 61, thereby
speeding operation of the device by providing
on-the-fly JPEG compression as a document
is being scanned.

It is preferred for compression processor 72 to
perform JPEG compression since JPEG compression
is [illegible]. However, other types of
compression [illegible] compression like JPEG
is preferable.

In addition, JPEG processor 72 can be configured,
via [illegible], to [illegible] bit-map pixel data. The
[illegible] gray-scale pixels [illegible] direct
connection so as to [illegible] data at any selectable
threshold level. By virtue of [illegible],
JPEG-compressed image files can be printed
quickly and without the need for software
[illegible] JPEG processor 72 and hence directly to
print interface [illegible] binary thresholding if desired.

[illegible] which CPU 60 manipulates and creates
files. In particular, disk 75 includes stored program
instruction sequences which segmentation-process 
gray-scale images of documents so as to distin- 
guish between text and non-text regions of the 
document image and so as to extract individual 
characters from the text regions, and stored pro- 
gram instruction sequences which recognition-pro- 
cess images of characters so as to determine the 
identity of the characters. Suitable recognition pro- 
cessing techniques include, without limitation, fea- 
ture and/or stroke extraction systems which extract 
feature and/or stroke information from character 
images for comparison to dictionaries of such in- 
formation, neural network recognition systems 
which mimic human neural interconnectivity so as 
to identify character images, and hybrid systems 
which contain aspects of both feature/stroke rec- 
ognition systems and neural network recognition 
systems. 

Read only memory ("ROM") 77 interfaces with 
computer bus 61 so as to provide CPU 60 with 
specialized and invariant functions such as start up 
programs or BIOS programs. Main random access 
memory ("RAM") 79 provides CPU 60 with mem- 
ory storage both for data and for instruction se- 
quences, as required. In particular, when executing 
stored program instruction sequences such as seg- 
mentation programs or character recognition pro- 
grams, CPU 60 normally loads those instruction 
sequences from disk 75 (or, in the case of network 
access, from other program storage media) to RAM 
79 and executes those stored program instruction 
sequences out of RAM. Working storage areas for 
data manipulation are also provided in RAM and, 
as shown in Figure 5, include working storage 
areas for gray-scale images, binary images, con- 
nected components and text files. 

[2.0 Operation] 

Operation of the above-described representative
embodiment of the invention will now be described
in connection with remaining Figures 6
through 22. In general, in accordance with operator
instructions (which are usually received via keyboard/trackball
interface 70 but which also may be
received from other sources such as via the local
area network 31 or wide area network 32 or via
telephone line 29 by modem or DTMF commands),
stored application programs are selected and activated
so as to permit processing and manipulation
of data. For example, any of a variety of application 
programs such as segmentation processing pro- 
grams, recognition processing programs, word pro- 
cessing programs, image editing programs, 
spreadsheet programs and similar information pro- 
cessing programs, may be provided for operator 
selection and use. Thus, a segmentation processing
program may be activated whereby a document
is scanned by scanner 22 and a gray-scale
image of the document stored in RAM 79. The
gray-scale image is then segmentation-processed
according to the stored program instructions,
whereby text and non-text regions of the document
are identified and individual characters from text
regions are extracted. Thereafter, recognition processing
programs may be activated which recognition-process
the extracted character images to
identify the characters and to store them to a text
file. The resulting text file may then be presented
to the operator for review and/or manipulation with
other application programs such as word processing
programs, or may be stored to disk or over
local area network 31, wide area network 32, or
telephone line 29.

[2.1 Programmable Function Keys] 


Figures 6 and 7 are close-up views of control 
panel 34 in connection with use and programming 
of programmable function keys 39. 

As mentioned above, PICS equipment 20 is a
networkable device and may be used by any of
various network users who ordinarily are located
remotely of PICS equipment 20. Consequently,
when documents need to be processed by PICS
equipment 20, a user ordinarily carries the documents
from his workstation to PICS equipment 20.
It is considered convenient to allow the user to
program the precise document-processing functions
that will be performed by PICS equipment 20
from the user's workstation so that those processing
functions may be carried out by PICS equipment
20 with minimal user effort when the user is
physically present at PICS equipment 20. On the
other hand, there is a time lapse between when the
user defines the image processing tasks that will
be performed by PICS equipment 20 and when the
user has physically arrived at PICS equipment 20
to carry out those image processing tasks; other
users should not be precluded from using PICS
equipment 20 during that time lapse.

As described herein, PICS equipment 20 preferably
includes programmable function keys 39
which may be programmed by network users from
their individual workstations, and which can be selected
by the network user when they physically
arrive at PICS equipment 20 for image processing.
Image processing tasks can include scanning of
new documents by scanner 22 of PICS equipment
20, retrieval of existing document images from various
network storage media, recognition-processing
of document images so as to create text files, and
storage of text files in various network storage
media, as well as related tasks such as execution
of other information processing programs, such as
spreadsheet or report-generating word processing
programs, which use the stored text files. Function
keys 39 can be programmed to chain some or all
of these image processing tasks together, so as to
provide a macro-like capability whereby a suite of
image processing or related tasks are executed by
the touch of a single one of function keys 39.

Preferably, the programmable function keys 39
are divided into two groups, one of the groups
being programmable only by network administrator
43 and the other of the groups being programmable
by any LAN user. The precise image processing
functions performed by any of the keys
may be displayed on display 35 as desired.

Briefly, Figures 6 and 7 are used for explaining
a personal imaging computer system which is
connectable to a local area network and which
performs recognition-processing of document images
so as to identify characters in the document
images. A plurality of programmable function keys
are provided on the personal imaging computer,
each of the function keys being manipulable by an
operator so as to cause the imaging computer
system to perform a pre-programmed image-processing
task. The plural programmable function
keys are partitioned into at least two groups
wherein the first group is programmable only by a
network administrator for the LAN and wherein the
second group is programmable by any LAN user.
Display means are provided for displaying an
image of the plural function keys. In response to
operator selection of an image of one of the plural
function keys, the display means displays the
function performed by that key.

More particularly, as shown in Figure 6, an
image 75 of programmable keys 39 is displayed by
display 35. As further shown in Figure 6, the image
is partitioned into two groups: a first group 76 of
programmable function keys which are restricted
for programming only by network administrator 43
and a second group 77 of programmable function
keys which are unrestricted and which are
programmable by any LAN user. Preferably, though not
shown in Figure 6, each function key displayed at
75 includes a display of a user identification for the
user that has currently programmed the key.

In operation, a user at his workstation 40
designates image processing tasks that are desired to
be performed by PICS equipment 20, selects one
of the programmable function keys in group 77,
and programs the function key via local area
network 31. Then, the user carries any documents that
are to be processed by PICS equipment 20 to the
physical location of PICS equipment 20. Upon
arriving at PICS equipment 20, the user displays the
display shown in Figure 6, whereupon he can
locate the key that he has programmed by reference
to the displayed user identifications.



Using trackball 36, the user may then select
any one of the displayed keys, including a key
programmed by network administrator 43, a key
programmed by himself, or a key programmed by
any other LAN user. Upon selection of a displayed
key, the current functions associated with that key
are displayed as shown at 78 in Figure 7. When the
user physically manipulates the actual function key 39,
PICS equipment 20 automatically performs the
indicated function.
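
The two-group key arrangement and the macro-like chaining of tasks
described in this section might be sketched as follows. This is a
minimal illustration only: the class names, the key numbers, the user
names, and the three placeholder tasks are all hypothetical and do not
come from the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Task = Callable[[List[str]], None]  # each task records its work in a log


@dataclass
class FunctionKey:
    restricted: bool                      # True: administrator-only group
    owner: Optional[str] = None           # user ID shown beside the key image
    tasks: List[Task] = field(default_factory=list)


class KeyPanel:
    """Hypothetical model of the programmable function keys."""

    def __init__(self, restricted_keys, open_keys, admin):
        self.admin = admin
        self.keys: Dict[int, FunctionKey] = {}
        for number in restricted_keys:
            self.keys[number] = FunctionKey(restricted=True)
        for number in open_keys:
            self.keys[number] = FunctionKey(restricted=False)

    def can_program(self, number, user):
        # Restricted keys accept programming only from the administrator.
        return user == self.admin or not self.keys[number].restricted

    def program(self, number, user, tasks):
        if not self.can_program(number, user):
            raise PermissionError(f"key {number} is administrator-only")
        key = self.keys[number]
        key.owner, key.tasks = user, list(tasks)

    def describe(self, number):
        # What the display would show when the key image is selected.
        key = self.keys[number]
        return key.owner, [task.__name__ for task in key.tasks]

    def press(self, number):
        # Pressing the physical key runs the chained tasks in order.
        log: List[str] = []
        for task in self.keys[number].tasks:
            task(log)
        return log


def scan(log): log.append("scan")
def recognize(log): log.append("recognize")
def store(log): log.append("store")


panel = KeyPanel(restricted_keys=[1, 2], open_keys=[3, 4], admin="admin")
panel.program(3, "alice", [scan, recognize, store])
```

Here key 3 belongs to the unrestricted group, so any user may program
it from a workstation and later trigger the whole chain with a single
press at the equipment.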



[2.2 Image Resolution Adjustment] 



Figure 8 is a flow diagram showing operation of
PICS equipment 20 in which a document is
scanned at a first resolution to form a gray-scale
image of the document, the first resolution being
suitable for recognition-processing text in the
document, character images in the gray-scale image are
recognition-processed so as to obtain a
computer-readable file of the text, the resolution of the
gray-scale image is reduced to a second resolution
which is lower than the first resolution and which is
suitable for visual perception and reproduction of
the image, and the reduced-resolution image is
stored in association with the computer-readable
text file. Like the remaining flow diagrams in the
appended figures, the process steps shown in
Figure 8 are executed by CPU 60 in accordance with
stored program instruction steps stored on
computer disk 75 (or other media) and transferred to
RAM 79 for execution from there by CPU 60.

More particularly, at step S801, a document on
document tray 21 is fed past scanner 22 so as to
scan the document and create an image of the
document. Preferably, the resolution at which the
document is scanned is suitable for recognition
processing, such as 400 dpi. On-the-fly JPEG
processor 72 compresses the image as it is scanned
in, and the compressed image is stored, such as in
disk 75 or in RAM 79.

At step S802, the document image is subjected
to optical character recognition processing to
create a text file for text areas of the document.
Optical character recognition processing is
described in additional detail with reference to
Figures 9-1, 9-2 and 9-3 in section 2.3, below.

At step S803, the resolution of the document
image is reduced so as to lower the storage
requirements for the document image. Preferably, the
resolution is lowered so as to be sufficient for
visual perception by human operators and for
adequate reproduction by display on a computer
screen or printing on paper. Presently, 70 dpi is
preferred. Techniques for lowering image resolution
are known, and it is preferred to select a technique
that preserves, to the extent possible, any color or
gray-scale content in the original image. Suitable
techniques may also employ error-diffusion
methods such as Burkes or Stucki methods so as to
enhance the appearance of the lowered resolution
image.

At step S804, the lowered resolution document 
image, in compressed or uncompressed form as 
desired, is stored in association with the text file 
from step S802. Storage may be to disk 75, but 
more preferably the document image and its asso- 
ciated text file are stored on one of network disks 
42 or 54 as part of a searchable database. 

Thus, as shown in step S805, the document 
image may be retrieved, for example, in response 
to query-based searches of the text file. More 
particularly, based on key word searches or other 
searches performed in response to operator que- 
ries, text files in the database are searched so as 
to identify text files which satisfy operator entered 
queries. Once such text files are identified, asso- 
ciated document images are retrieved and the doc- 
ument images are presented to the operator in 
desired form, such as by display or by printing. 

Because the document is scanned in at a reso- 
lution sufficient for recognition-processing, but then 
is stored at a lowered resolution together with an 
associated text file, storage requirements for stor- 
ing large databases of such documents are signifi- 
cantly lowered. 
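
As a rough illustration of the saving, the uncompressed pixel counts
at the two preferred resolutions can be compared. The page size used
below (8.5 by 11 inches) is an assumption made only for this example,
and the figures ignore compression and bit depth.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels in a page scanned at the given resolution."""
    return round(width_in * dpi) * round(height_in * dpi)


scan_dpi, store_dpi = 400, 70       # resolutions named in the text
page = (8.5, 11.0)                  # assumed page size in inches

scanned = pixel_count(*page, scan_dpi)
stored = pixel_count(*page, store_dpi)
ratio = scanned / stored            # equals (400 / 70) squared
```

Under these assumptions the stored image holds roughly one
thirty-third of the pixels of the scanned image, because the pixel
count scales with the square of the resolution.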

[2.3 - Optical Character Recognition Processing — 
Summary] 

Figures 9-1, 9-2 and 9-3 summarize optical 
character recognition-processing by which charac- 
ters in a document are identified, as described 
above at step S802. Briefly, according to any one 
of Figures 9, a document is scanned to obtain a 
gray-scale image of the document, a binary image 
is generated from the gray-scale image by compar- 
ing the gray-scale image with a threshold, the 
binary image is segmented to locate individual 
characters within the binary image and to deter- 
mine the shape of the individual characters, and 
gray-scale image information is extracted for each 
individual character from the gray-scale image us-
ing the location and shape of the character in the
binary image as a template. The extracted gray- 
scale image information is then recognition-pro- 
cessed to determine the identity of the character, 
and the identity of the character is stored. 

Thus, as shown in Figure 9-1 at step S901, a
gray-scale image of a document is input. Preferably,
to input the gray-scale image of a document,
the document is scanned by scanner 22, but it is
also possible to input a document image that was
created elsewhere, for example, a document image
scanned remotely and transmitted to PICS
equipment 20 via telephone line 29, local area network
31, or wide area network 32.

At step S902, the scanned-in image is
de-skewed. Image skew can result from improper
scanning of the document such as by feeding the
document crookedly past scanner 22, or it can
result from scanning a document which is a
mis-oriented copy of some other original document.
Whatever its source, skew can cause errors in
character recognition, so if it is present, skew is
removed at step S902 as described in greater
detail in section 2.4 in connection with Figures 10
and 11. In this regard, it is possible to store the
skew corrections that are made in step S902 so
that the skew corrections can be "un-done" after
recognition-processing the image and in preparation
for image storage (see step S804), but ordinarily
the skewed image is simply discarded and only
the de-skewed image is retained.

At step S903, a copy of the gray-scale image
is preserved in RAM 79 so that gray-scale
character images may later be extracted from it for
recognition processing (see steps S907 and S908).

At step S904, a binary image is derived from
the gray-scale image by comparing the gray-scale
image with a threshold. Thresholding is described
in greater detail below with reference to Figures 12
and 13 in section 2.5. The binary image so
obtained is stored in RAM 79.

At step S905, the binary image is
segmentation-processed to distinguish between text and
non-text areas of the document and to locate individual
characters within text areas of the document.
Segmentation processing is described below in section
2.6 in connection with Figure 14. Based on the
location of individual characters within the binary
image, character templates are obtained (step
S906) from the shape of the binary character
images.

In step S907, gray-scale character images are
extracted from the gray-scale image stored in step
S903 using the templates derived in step S906.
The extracted gray-scale character images are then
recognition-processed (step S908) so as to identify
each individual character in text areas of the
document.
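
The template-based extraction of steps S906 and S907 might be
sketched as follows. The function name, the inclusive bounding-box
convention, the background fill value, and the tiny sample images are
assumptions made for illustration only.

```python
def extract_gray_character(gray, binary, box, background=0):
    """Copy gray-scale pixels inside `box` wherever the binary template is 1.

    `box` is (top, left, bottom, right) with inclusive bounds; pixels
    outside the character shape are replaced by `background`.
    """
    top, left, bottom, right = box
    return [
        [gray[r][c] if binary[r][c] == 1 else background
         for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Hypothetical 3 x 4 gray-scale image and the binary image derived from it.
gray = [
    [10, 12, 11, 9],
    [10, 200, 180, 9],
    [11, 190, 14, 9],
]
binary = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
char = extract_gray_character(gray, binary, box=(1, 1, 2, 2))
```

The binary image supplies only the location and shape; the values
handed to recognition processing are the original gray-scale pixels.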

In step S915, the character identities are stored
in a computer readable text file such as in ASCII
format. Page reconstruction is undertaken so that
the reading order of the text file accurately reflects
the reading order in the original document. For
example, referring briefly back to Figure 2, it will
be appreciated that one line of text in the left hand
column should not be followed by a corresponding
line of text in the right hand column, but rather that
all lines of text in the left hand column should be
followed by all lines of text in the right hand
column. Step S915 accomplishes this page
reconstruction so as to obtain correct reading order for
the text file.

In step S916, the text file is output such as by
outputting to disk 75 or to network disks 42 and 54.
As described above at step S804, the text file is
often stored in association with its document file so
as to assist in retrieving the document.

Figure 9-2 is a flow diagram of a character 
recognition processing system which selects one of 
plural recognition processing techniques in accor- 
dance with font characteristics of characters in text 
areas of the document. The selected recognition 
processing technique is specifically tuned to the 
font characteristics, so that, for example, if the font 
characteristics indicate that a uniform-pitch font is 
being used, then a uniform-pitch recognition pro- 
cessing technique is selected, while if the font 
characteristics indicate that a sans-serif font is 
being used, then a sans-serif recognition pro- 
cessing technique is selected. 

Thus, according to the character recognition 
processing system of Figure 9-2, which determines
the identity of characters from images of the char- 
acters, an image of a document which includes text 
areas is processed to locate lines of characters, 
font characteristics of the characters in each line 
are determined, one of plural recognition process- 
ing techniques is selected based on the font char- 
acteristics so determined, individual character im- 
ages are extracted from each line, and each ex- 
tracted character image is recognition-processed in 
accordance with the selected recognition process- 
ing technique. 

More specifically, in steps S901, S902, S903,
S904, S905, S906 and S907, a gray-scale image is
input, the gray-scale image is de-skewed, a copy
of the de-skewed image is preserved, a binary
image is derived by global thresholding, the binary
image is segmentation-processed to locate
character images, character templates are obtained
from the shape of the binary images, and
characters are extracted from the gray-scale image
using the templates, all as described above with
respect to Figure 9-1.

In step S909, font characteristics of characters 
in a line are determined. The determination may be 
made based on character attributes determined 
during segmentation processing, or the determina- 
tion may be made based on the extracted char- 
acters from the binary or gray-scale image. "Font 
characteristics" include character spacing, such as 
uniform or proportional spacing, as well as the 
appearance of the font such as sans-serif or serif
font, italicization, bold, or the like.

In step S910, one of plural recognition processing
techniques is selected, the selected technique
being tuned to the specific font characteristics
determined in step S909. More particularly, if it is
known that a font is, for example, Univers, which is
a sans-serif font, then a recognition processing
technique which is specifically tuned to a
sans-serif font may be used. Such a recognition
processing technique is particularly well-suited for
recognition processing sans-serif characters since it
is known, for example, that there will be fewer
touching characters in a sans-serif font than with
a serif font. Likewise, if step S909 determines that
the font is a uniformly-spaced font such as Courier,
then a uniform-spacing recognition technique may
be selected which is specifically tuned for that font.
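
The selection in step S910 amounts to a dispatch on the font
characteristics found in step S909. A minimal sketch follows; the
dictionary keys, the technique labels, the order in which the
characteristics are checked, and the fallback are all assumptions,
since the text does not specify them.

```python
def select_technique(font):
    """Pick a recognition technique label from font characteristics.

    `font` is a hypothetical dict such as
    {"spacing": "uniform", "serif": True}.
    """
    if font.get("spacing") == "uniform":
        return "uniform-pitch"      # e.g. Courier
    if font.get("serif") is False:
        return "sans-serif"         # e.g. Univers
    return "general"                # assumed fallback, not from the text


courier_like = select_technique({"spacing": "uniform", "serif": True})
univers_like = select_technique({"spacing": "proportional", "serif": False})
```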

In step S910, extracted gray-scale character
images are recognition-processed using the
selected recognition technique. Then, in steps S915
and S916, page reconstruction is carried out so as
to order identified characters into proper order and
the text file so created is then output, as explained
above in connection with Figure 9-1.

Figure 9-3 shows alternative processing
according to the invention which yields improved
recognition accuracy, especially when processing
hard-to-recognize fonts such as italicized and
proportionally-spaced fonts. According to the character
recognition system illustrated in Figure 9-3, the
identity of characters in a document is determined
by thresholding a gray-scale image of the
document to obtain a binary image and by segmenting
the binary image to locate binary images of
characters and to determine attributes of the binary
images of the characters. Gray-scale images of the
characters are extracted based on the shape of the
character in the segmented binary image, and both
the gray-scale character image and the binary
character image are recognition-processed to
determine identities for the character. Any
inconsistencies between the results of recognition-processing
the gray-scale character image and
recognition-processing the binary character image are
then resolved based on the character attributes
determined during segmentation-processing.

More particularly, in steps S901 through S908,
a gray-scale image is input, the gray-scale image
is de-skewed, a binary image is determined by
thresholding, the binary image is
segmentation-processed to locate character images, character
templates are obtained from the shape of the
binary images, gray-scale character images are
extracted using the templates, and the extracted
gray-scale character images are
recognition-processed, as described above in connection with
Figure 9-1.

In step S913, binary character images which
were extracted during segmentation-processing in
step S905 are recognition-processed to determine
the identity of the binary character images. In step
S914, any inconsistencies between the results of
recognition-processing gray-scale character images
(in step S908) and recognition-processing binary
character images (in step S913) are resolved
based on physical image attributes of character
images obtained during segmentation processing in
step S905. For example, it is often difficult to
distinguish between a lower case "L" ("l"), a
numeric one ("1") and square brackets ("[" or "]"),
and because of differences in recognition processing
in steps S908 and S913 it is possible that
different identities will be determined for any one of
those characters. In that situation, the
inconsistencies are resolved by reference to physical
attributes obtained during segmentation-processing
in step S905. More particularly, and as detailed
below with respect to Figure 14, during
segmentation processing, physical attributes such as pixel
density and aspect ratio are determined for each
character image (more precisely, as explained
below, for each connected component in the image).
Based on those physical attributes, the results of
recognition-processing in steps S908 and S913
may be disambiguated.

In steps S915 and S916, page reconstruction
and text outputting are performed as described
above in connection with Figure 9-1.

[2.4 - De-skewing] 

Figures 10(a) and 10(b) and Figures 11(a)
through 11(c) are used for explaining de-skew
processing according to the invention. As seen in
these figures, an image is de-skewed by determining
the skew of the image, de-skewing by mathematical
rotational transformation in the case where
the skew is greater than a predetermined limit such
as ± 10°, and de-skewing by vertical shifting of
pixel data if the skew is less than the predetermined
limit. De-skewing in accordance with this
technique saves considerable time since, in most
cases, it will not be necessary to perform a
mathematical transformation of pixel data. Mathematical
transformations of pixel data are expensive in
terms of processor time, especially in situations
where gray-scale pixel data is involved, since each
pixel in the de-skewed image is a mathematical
combination of several pixels in the skewed image.
In addition, since de-skewed pixel values are
mathematically calculated, generally speaking, not one
de-skewed pixel value will be the same as a pixel
value in the originally-scanned image, leading to
increased recognition inaccuracies (e.g., replacing
pixels whose values are "1" and "2", respectively,
with their average value ("1½") results in pixels
whose values are not anywhere in the original
image). On the other hand, simple shifting of the
skewed image to obtain a de-skewed image does
not involve such mathematical combinations, and,
in addition, results in pixel values from the
originally-scanned image. Of course, since vertical shifting
introduces some image distortions, if the skew of
the image is too great, then mathematical
transformations, which do not introduce such distortions,
cannot be avoided.

More particularly, as shown in Figure 10(a), in
steps S1001 through S1004 image skew is
determined by baseline analysis of pixel data in the
image, such as by application of a modified Hough
transform like that described by Hinds, et al., "A
Document Skew Detection Method Using Run
Length Encoding And The Hough Transform", IEEE
10th International Conference On Pattern
Recognition, June, 1990, page 464. In more detail, in step
S1001, the image is sub-sampled so as to reduce
the amount of data that needs to be processed.
Preferably, the image is sub-sampled so as to yield
an image with about 100 dpi resolution, which is
sufficient for accurate skew detection. Thus, if the
image being de-skewed is input at 400 dpi
resolution, it is sub-sampled in a 1:4 ratio, meaning that
only every fourth pixel in the original image is used
in the sub-sampled image, so as to yield a 100 dpi
sub-sampled image. Sub-sampling ratios are
selected in the same way for different input
resolutions, such as a 1:6 ratio for 600 dpi images.
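
The sub-sampling of step S1001 might be sketched as follows. The
function name and the list-of-rows image representation are
assumptions, and the sketch applies the ratio in both the horizontal
and vertical directions, which the text implies but does not state.

```python
def subsample(image, input_dpi, target_dpi=100):
    """Keep every n-th pixel so the result is about `target_dpi`."""
    step = max(1, input_dpi // target_dpi)   # 400 dpi -> 4, 600 dpi -> 6
    return [row[::step] for row in image[::step]]


# Hypothetical 8 x 8 image whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
small = subsample(image, input_dpi=400)
```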

In step S1002, the sub-sampled image is
binarized, such as by application of an arbitrary
threshold or by application of the threshold that is
calculated in connection with Figures 12 and 13
(see below).

In step S1003, a rough Hough transform is
applied to the sub-sampled and binarized data so
as to determine roughly the skew angle in the
original image. More particularly, the Hough
transform may be applied between predetermined limits
such as ± 20° in rough angular resolution, such as
every one degree. If desired, prior to Hough
transformation, image baseline sensitivity can be
amplified by replacing vertical runs of pixel data with
the length of the vertical run positioned at the
bottom of the vertical run, and by omitting pixel
data which represents pictures and lines.

In step S1004, a fine Hough transform is
applied to the sub-sampled and binarized image
using the rough skew information obtained in step
S1003. More particularly, in a ± 1° neighborhood
around the rough skew angle determined in step
S1003, a fine Hough transformation is applied in
fine angular resolution such as 0.1°.
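
The rough-then-fine search of steps S1003 and S1004 might be sketched
as below. This sketch uses a plain point-based Hough accumulator, not
the modified run-length transform of Hinds et al.; the function names,
the 0.25-pixel accumulator cell size, the scoring by tallest cell, and
the synthetic baseline points are all assumptions for illustration.

```python
import math
from collections import Counter


def hough_score(points, angle_deg, rho_bin=0.25):
    """Height of the tallest accumulator cell for one candidate angle."""
    a = math.radians(angle_deg)
    cells = Counter(
        round((y * math.cos(a) - x * math.sin(a)) / rho_bin)
        for x, y in points
    )
    return max(cells.values())


def estimate_skew(points, limit=20, rough_step=1.0, fine_step=0.1):
    # Rough pass: every `rough_step` degrees between -limit and +limit.
    n_rough = int(2 * limit / rough_step) + 1
    rough_angles = [-limit + i * rough_step for i in range(n_rough)]
    rough = max(rough_angles, key=lambda a: hough_score(points, a))
    # Fine pass: `fine_step` resolution within one degree of the rough angle.
    n_fine = int(round(2 / fine_step)) + 1
    fine_angles = [rough - 1 + i * fine_step for i in range(n_fine)]
    return max(fine_angles, key=lambda a: hough_score(points, a))


# Synthetic data: two text baselines skewed by 4.4 degrees.
true_skew = 4.4
slope = math.tan(math.radians(true_skew))
points = [(x, base + x * slope) for base in (50, 120) for x in range(200)]
estimate = estimate_skew(points)
```

The rough pass evaluates 41 angles and the fine pass 21, instead of
the 401 angles a single pass at 0.1 degree resolution over the same
range would need.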

In step S1005, the skew angle determined in
step S1004 is compared with a predetermined limit
such as ± 10°. If the skew is greater than the
predetermined limit, then flow advances to step
S1006 in which the image is de-skewed by
mathematical transformation. On the other hand, if the
skew is less than the predetermined limit, flow
advances to step S1007 in which a vertical shift
factor is determined based on the skew. More
particularly, as shown in Figure 11(a), a skew angle
theta (θ) is first calculated as described above in
steps S1001 to S1004. Then, from the skew angle
θ, a vertical shift factor that will reduce the skew
angle θ to zero is calculated; in the example of
Figure 11(a) the vertical shift factor is one pixel
down for every 13 pixels across, corresponding to
a skew angle of 4.4 degrees. Then, as shown in
Figure 11(b), working from left to right, all columns
of the image are successively shifted upwardly or
downwardly in accordance with the shift factor
(step S1008). After shifting, it is observed that the
skew angle θ has been reduced to zero.
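
The relation between the skew angle and the shift factor, and the
column shifting of step S1008, might be sketched as follows. The
function names, the column-major image representation, the fill
value, and the choice to handle only downward shifts are assumptions
made to keep the example short.

```python
import math


def columns_per_pixel(skew_deg):
    """Columns crossed per one-pixel vertical shift: 1 / tan(skew)."""
    return round(1 / math.tan(math.radians(abs(skew_deg))))


def shift_columns(columns, skew_deg, fill=0):
    """Shift each column down by one more pixel every n columns.

    `columns` is a list of columns, each a top-to-bottom list of pixels.
    """
    n = columns_per_pixel(skew_deg)
    shifted = []
    for i, col in enumerate(columns):
        s = i // n                       # pixels of shift for this column
        shifted.append([fill] * s + col[:len(col) - s])
    return shifted


# 26 hypothetical columns, each with one black pixel in the top row.
columns = [[1, 0, 0, 0] for _ in range(26)]
shifted = shift_columns(columns, skew_deg=4.4)
```

For a skew of 4.4 degrees, 1 / tan(4.4°) is approximately 13, which
agrees with the one-pixel-per-13-columns factor given for Figure 11(a).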

Reverting to Figure 10(a), once the image has
been de-skewed either in accordance with
mathematical transformation in step S1006 or in
accordance with pixel shifting in step S1008, the
de-skewed image is output (step S1009).

Although advantageous in terms of the savings 
of processing time, de-skewing by pixel shifting 
can sometimes, in certain circumstances, distort 
the image of the character. For example, as seen 
in Figure 11(b), each of the images of the character 
"a" is distorted because shifting downwardly has 
occurred in the middle of each of those characters. 
Figure 10(b) shows flow processing which avoids 
this kind of distortion. 

In Figure 10(b), steps S1001 through S1007
are identical to those of Figure 10(a). In step S1010,
when it is time to shift columns of the image
upwardly or downwardly in accordance with the
shift factor, CPU 60 determines whether or not the
image is in a blank space between characters. If
CPU 60 determines that it is between characters,
then flow advances to step S1011 in which all
columns of the image are successively shifted
upwardly or downwardly, relative to previously-shifted
columns, in accordance with the shift factor. On the
other hand, if not between characters, then the shift
factor is simply accumulated (step S1012), and no
shifting takes place. Flow returns to step S1010
and shifting takes place only when between
characters. Thus, as shown in Figure 11(c), shifting
occurs between the two "a" characters and occurs
at the accumulated shift factor which, in this case,
is "down 2". Flow then proceeds as before with the
de-skewed image being output at step S1013.

By processing in accordance with Figure 10(b), 
it is possible to avoid distortion of each character 
because pixel shifting occurs only between char- 
acters and not in the middle of characters. 
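
The accumulate-then-shift behavior of Figure 10(b) might be sketched
as follows. The function name, the column-major representation, the
test for a blank column (no black pixels), and the restriction to
downward shifts are assumptions made for illustration.

```python
def shift_between_characters(columns, n, fill=0):
    """Shift columns down, changing the shift only at blank columns.

    `n` is the number of columns per one-pixel shift; a column is
    treated as blank when it contains no black (non-zero) pixels.
    """
    out, applied, pending = [], 0, 0
    for i, col in enumerate(columns):
        if i > 0 and i % n == 0:
            pending += 1                 # accumulate the shift (step S1012)
        if pending and not any(col):
            applied += pending           # between characters (step S1011)
            pending = 0
        out.append([fill] * applied + col[:len(col) - applied])
    return out


# Four columns of one character, a blank gap, then the next character.
columns = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0], [1, 0, 0]]
result = shift_between_characters(columns, n=2)
```

In this example two shifts fall due while the first character is
being crossed, so they are applied together at the gap, which
corresponds to the "down 2" case of Figure 11(c).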

[2.5 - Thresholding] 

Figure 12 is a detailed flow diagram showing
the thresholding operation depicted in step S904.
According to the thresholding system shown in
Figure 12, a binary image is formed from a
gray-scale image by forming a histogram of pixel
intensities of the gray-scale image, identifying the top
two groups in the histogram that are separated by
at least one histogram group, calculating a global
threshold at half the distance between those two
top groups, comparing each pixel in the gray-scale
image to the global threshold so as to binarize
each pixel, and outputting a binary image
corresponding to the gray-scale image.

Thus, in step S1201, a histogram of
pixel intensities is formed for the gray-scale image.
As illustrated in Figure 13(a), the histogram
includes plural groups of pixel intensities, the height
of each group being determined based on the
number of pixels in the gray-scale image that fall
within the group. In Figure 13(a), eight groups, I
through VIII, have been designated based on a
gray-scale image intensity which varies from 0 to
255. Other groupings are possible, but the
grouping shown in Figure 13(a) is preferred because it is
simple to implement.

In step S1202, the histogram is examined to
determine whether the gray-scale image is a
"reverse video" image, meaning that the image is not
black-on-white, as in conventional images, but
rather is white-on-black. If the histogram indicates that
the gray-scale image is a reverse video image,
then the gray-scale is inverted (step S1203) so as
to convert the image to a conventional
black-on-white image.

In step S1204, the histogram groups are sorted
in descending order based on the height of the
groups. In the example of Figure 13(a), group VIII,
which has the highest numerical value, is the first
group and group V, which has the lowest numerical
value, is the last group. Thus, the groups in Figure
13(a) are sorted as shown in Figure 13(b).

In step S1205, the top two groups that are
separated by at least one group are selected.
Thus, as shown in Figure 13(b), groups VIII and VII
are first compared because they are the top two
groups. However, because they are not separated
by at least one group (i.e., numerically, group VIII is
immediately adjacent to group VII), groups VIII and
VII are not selected. Instead, groups VII and II,
which are the next top two groups, are compared.
Since groups VII and II are separated by at least
one group (in this example, they are separated
numerically by four groups), groups VII and II are
selected in step S1205.

In step S1206, the global threshold is
calculated at half the distance between the two groups
selected in step S1205. Thus, as shown in Figure
13(a), groups II and VII are separated by a distance
of 160 (i.e., 192-32). The global threshold for this
representative gray-scale image is therefore
calculated at TH = 160 ÷ 2 = 80.
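
Steps S1201 and S1204 through S1206 might be sketched as follows. The
function name and the sample pixel data are hypothetical, and the
sketch follows the text literally in two respects that are open to
interpretation: groups are compared pairwise in sorted order, and the
threshold is taken as half the distance between the lower bounds of
the two selected groups.

```python
def global_threshold(pixels, n_groups=8, max_value=255):
    """Global threshold from a grouped histogram of pixel intensities."""
    width = (max_value + 1) // n_groups       # 32 intensities per group
    heights = [0] * n_groups
    for p in pixels:
        heights[p // width] += 1              # step S1201: build histogram
    # Step S1204: sort group indices by descending height.
    order = sorted(range(n_groups), key=lambda g: heights[g], reverse=True)
    # Step S1205: first consecutive pair separated by at least one group.
    for a, b in zip(order, order[1:]):
        if abs(a - b) >= 2:
            # Step S1206: half the distance between the two groups.
            return abs(a - b) * width / 2
    return None                               # no suitable pair found


# Hypothetical image whose tallest groups are VIII, VII and II, in that order.
pixels = ([250] * 50 + [200] * 40 + [40] * 30 + [10] * 5 +
          [70] * 4 + [100] * 3 + [170] * 2 + [130] * 1)
threshold = global_threshold(pixels)
```

With this data groups VIII and VII are adjacent and are skipped, groups
VII and II are selected, and the result matches the worked value
TH = 80.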

In step S1207, the intensity of each pixel in the
gray-scale image is compared to the global
threshold calculated in step S1206 to binarize the
gray-scale image. As shown in Figure 12, if the
comparison indicates that the pixel intensity is less
than the global threshold, then the pixel is set to a
binary "0" indicating that the pixel is white (step
S1208). On the other hand, if the pixel intensity is
higher than the global threshold, the pixel is set to
a binary "1" indicating that the pixel is black (step
S1209).

After all pixels of the gray-scale image have
been compared to the global threshold and binarized
accordingly, the binary image is output (step
S1210).
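
The per-pixel comparison of steps S1207 through S1209 might be
sketched as follows. The function name is hypothetical, and the
treatment of a pixel exactly equal to the threshold (set to "0" here)
is an assumption, since the text addresses only the less-than and
higher-than cases.

```python
def binarize(gray, threshold):
    """Set pixels above the global threshold to 1 (black), others to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]


binary_row = binarize([[10, 80, 200]], threshold=80)
```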

[2.6 - Segmentation-Processing] 

Figure 14 is a flow diagram illustrating seg- 
mentation-processing, as mentioned above in step 
S905, by which text and non-text areas in a docu- 
ment image are identified and by which individual 
characters in text regions are extracted. Processing 
in Figure 14 proceeds by way of connected com-
ponent analysis of the binary image derived in step
S904. A "connected component" is a group of 
connected black pixels surrounded everywhere by 
white pixels. For ordinary printed pages like those 
in a printed copy of this patent application, a con- 
nected component is usually a character or a sepa- 
rated part of a character, but for underlined char- 
acters or cursive script, a connected component 
may be a group of connected characters. 

Generally, as shown in Figure 14, text areas in 
a document image, which includes both text areas 
and non-text areas, are located by identifying con- 
nected components in the document image, deriv- 
ing image attributes such as pixel density and 
aspect ratio for each of the connected components, 
and filtering each connected component based on 
the image attributes so as to separate connected 
components representing a text area from con- 
nected components representing a non-text area. 
Filtering is performed by successively applying 
plural sets of rules to the image attributes for each 
unknown type of connected component until it can 
be determined whether the unknown connected 
component is text or non-text. 
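
The successive application of rule sets might be sketched as a simple
cascade. Everything specific below is hypothetical: the attribute
names, the three example rules, their numeric limits, and the default
outcome are invented for illustration and are not the rules of the
specification.

```python
TEXT, NON_TEXT, UNKNOWN = "text", "non-text", "unknown"


def rule_too_large(c):
    # Hypothetical rule: very tall components are not characters.
    return NON_TEXT if c["height"] > 200 else UNKNOWN


def rule_sparse(c):
    # Hypothetical rule: very low pixel density suggests a frame or line.
    return NON_TEXT if c["density"] < 0.05 else UNKNOWN


def rule_character_like(c):
    # Hypothetical rule: moderate aspect ratio suggests a character.
    aspect = c["width"] / c["height"]
    return TEXT if 0.1 <= aspect <= 2.0 else UNKNOWN


RULES = [rule_too_large, rule_sparse, rule_character_like]


def classify(component, rules=RULES, default=NON_TEXT):
    """Apply each rule in turn until one decides text or non-text."""
    for rule in rules:
        verdict = rule(component)
        if verdict != UNKNOWN:
            return verdict
    return default


letter = classify({"width": 20, "height": 30, "density": 0.4})
picture = classify({"width": 500, "height": 400, "density": 0.5})
```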

More particularly, in step S1401, an image to
be segmentation-processed is input. Preferably, the
image is the binary image derived by thresholding
in step S904 but, in general, the image may be any
image that needs to be segmentation-processed.
For example, the image may be an image that is
scanned by a digital copier in preparation for
image reproduction. In that case, segmentation-processing
may be needed to determine which areas
of the image are text and which are non-text so as
to control reproduction characteristics based on
that determination. Thus, the segmentation-processing
that is described here may be employed in
a determination of which areas of an image are text
areas so that those areas may be reproduced by a
digital copier using only black toner, and which
areas of an image are non-text areas so that those
areas may be reproduced by a digital copier using
cyan, magenta, yellow and black toner in
combination.

In step S1402, underlining in the image is
detected and removed. Underlines can distort the
connected component analysis that is to follow by
causing all underlined characters to be identified as
a single connected component rather than several
separate connected components. Underline removal
is described in more detail below in section 2.6.1
in connection with Figures 18 and 19.

In step S1403, the image is analyzed to identify
all connected components. As mentioned
above, a "connected component" is a group of
connected black pixels surrounded everywhere by
white pixels. Thus, as shown in Figure 15, which
depicts pixels forming an image of the word
"finally", connected components are obtained by
eight-direction analysis of each pixel in the image.
More particularly, starting from an initial pixel such
as pixel 80, which is the lower right-most black
pixel in the image shown in Figure 15, surrounding
pixels are examined in eight directions as shown
by cluster 81 to determine whether there are any
adjacent black pixels. Pixel 82 is such a black
pixel, and eight-direction processing proceeds
again from pixel 82 whereby the perimeter of a
connected component is traversed as shown by
arrows 84.

Each pixel in the image is analyzed as described
in Figure 15 so as to identify and obtain
the location of each connected component in the
image, including internal connected components
such as individual entries in a framed table (see 16
in Figure 2). In this embodiment, the location of
each connected component is defined by the location
of a circumscribing rectangle, such as rectangles
85 in Figure 15.
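
As an illustration of eight-direction connectivity and circumscribing rectangles, the following minimal Python sketch labels components with a breadth-first flood fill. The flood fill is a substitute chosen for brevity; it is not the perimeter traversal of Figure 15. The function name `connected_components` and the list-of-rows image format (1 for black, 0 for white) are assumptions made for this example.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected black regions.

    `image` is a list of equal-length rows; 1 is a black pixel, 0 is white.
    """
    rows = len(image)
    cols = len(image[0]) if image else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over the eight neighbours of each pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Note that this sketch holds the whole image in memory, which is the cost the text attributes to this style of analysis.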

Although the eight-direction processing shown
in Figure 15 accurately identifies connected components,
it is an expensive procedure in terms of
CPU processor time and memory storage requirements
since the entire image must ordinarily be
present in memory at one time. Connected component
processing described below in section 2.6.2 in
connection with Figures 20 and 21 is a more efficient
technique for obtaining connected components,
and is therefore preferred in this step S1403.

In step S1404, physical image attributes are
derived for each connected component. Thus, as
shown in Figure 16, for each connected component,
image attributes such as aspect ratio, pixel
count, density, perimeter, perimeter/width ratio, and
(perimeter)²/area ratio are all derived. In addition, a
"type" attribute is also associated with each connected
component. Initially, the type attribute is set
to "unknown" but ultimately the type for each connected
component will be set to "text" or "non-text"
in accordance with further processing. It
should be observed that the physical image attributes
that are derived in this step S1404 are the
attributes which are used in ambiguity-resolution
step S914 of Figure 9-3.
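
The attribute derivation of step S1404 can be sketched as follows. The text names the attributes but does not define them, so the definitions here are assumptions: density is taken as black pixels divided by the area of the circumscribing rectangle, perimeter as the number of black pixels with at least one four-neighbour outside the component, and area as the rectangle area. Aspect ratio follows the height-to-width definition given in section 2.6.3. The function name `component_attributes` is hypothetical.

```python
def component_attributes(pixels):
    """Derive attributes for one component given as a set of (row, col) black pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = height * width  # assumed: area of the circumscribing rectangle
    # Assumed perimeter: black pixels with at least one 4-neighbour outside the component.
    perimeter = sum(
        1 for (r, c) in pixels
        if any((r + dr, c + dc) not in pixels
               for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)))
    )
    return {
        "type": "unknown",  # later set to "text" or "non-text"
        "height": height,
        "width": width,
        "aspect_ratio": height / width,
        "pixel_count": len(pixels),
        "density": len(pixels) / area,
        "perimeter": perimeter,
        "perimeter_per_width": perimeter / width,
        "perimeter_sq_per_area": perimeter ** 2 / area,
    }
```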

In step S1405, the connected components are
inspected to determine whether the image is oriented
as a portrait or a landscape image. More
particularly, since most images are scanned in
portrait orientation, processing described here is
arranged to handle only portrait-type orientation.
Accordingly, if a landscape orientation is detected
in step S1405, flow switches to step S1406 which
rotates the image 90 degrees so as to obtain a
portrait-orientation image. Flow then returns to step
S1404 to re-derive attributes for each connected
component.

Once a portrait orientation image has been 
obtained, flow advances to step S1407 in which, for 
each "unknown" type of connected component, 
plural sets of rules are successively applied to the 
connected component so as to determine whether 
the connected component is a text or a non-text 
connected component. Connected component rules 
are described in more detail with reference to Fig- 
ure 22, but in general, the rules are applied to the 
attributes determined in step S1404 rather than to 
the connected component itself. In addition, the 
rules are preferably organized so that the first rules 
that are applied are simple rules that take little time 
to calculate and which separate, early on, easy-to- 
differentiate text from non-text connected compo- 
nents. Later rules are more complicated and take 
more time to apply, and separate hard-to-differen- 
tiate text from non-text connected components. 
However, because there are fewer "unknown" type 
of connected components at this later processing 
stage, these later rules are applied less frequently. 

In step S1408, "text-type" connected components
are analyzed to identify lines of text. Lines of
text are identified so as to assist in page reconstruction
step S915. In addition, by identifying
lines of text, it is also possible to re-connect portions
of characters which have been separated by
connected component analysis. For example, as
seen in Figure 15, dot 86 over the "i" has been
separated by connected component analysis from
the body 87 of the "i". By identifying text lines, as
shown in step S1408, it is possible to re-connect
those connected components to form a complete
character "i" when characters are subsequently cut
from text lines, as described below in step S1411.

In step S1409, if there are any touching lines of
text, they are separated in step S1410. Then, in
step S1411, individual characters are cut from the
text line for further processing. Thus, for example,
in connection with Figures 9-1 to 9-3, the individual
characters cut from the text line are used as templates
in step S906 so as to extract characters
from a gray-scale image of the characters in step
S907. In addition, in step S913, the characters cut
in this step S1411 are themselves recognition-processed.

Figure 17 shows how the above processing
affects the image of the word "finally". As shown
in Figure 17, and in accordance with step S901,
document 90 which includes the printed word
"finally" is scanned in at pixel resolution 91 so as to
input a gray-scale image 92 of the word "finally".
After de-skewing (step S902), a copy of the gray-scale
image is preserved at 93, in accordance with
step S903. Then, in accordance with step S904,
the gray-scale image is thresholded so as to form a
binary image 94.

The binary image is then segmentation-processed
as described above in step S905. More
particularly, with reference to Figure 14, underlines
are removed (step S1402) to yield image 95.
Through connected component analysis (steps
S1403 through S1412), characters 96 are cut from
image 95. Then, templates 97 are obtained (step
S906) and the templates are applied to the copy 93
of the gray-scale image so as to extract gray-scale
character images 98 (step S907). Note that the
template may be enlarged by about two pixels so
as to be certain that all relevant pixels from the
gray-scale image are properly extracted. Note also
that since the gray-scale image 93 is preserved
with underlines intact, when gray-scale character
images are subtracted they may include small residual
underlining. These small residuals, however,
do not interfere with recognition processing. Recognition
processing is then performed on the extracted
gray-scale character image so as to identify
the extracted character image; in this example, for
the character "f", recognition-processing yields the
ASCII code "66hex" which is the hexadecimal value
of the ASCII code for the character "f".

[2.6.1 - Underline Removal] 

Figure 18 is a flow diagram for explaining how
underlines are removed in accordance with step
S1402. Underlines are not literally removed, but
rather underlined characters are separated from the
underline. Connected component analysis determines
that the separated underline segments are
"non-text" and ignores them in subsequent recognition
processing.

Generally speaking, underlines are separated
from underlined characters in an image by traversing
the image row-by-row from top to bottom, calculating
for each row run lengths of horizontal pixel
runs in the image, comparing run lengths in each
row to run lengths in a previous row, cleaving the
image horizontally when it is determined that the
run length of a current row has increased more
than a predetermined value over the run length of a
prior row, traversing the cleaved image row-by-row
from bottom to top, calculating run lengths for
current rows and comparing run lengths with prior
rows, and cleaving the image vertically and reconnecting
a prior horizontal cleave when it is determined
that the run length of a current row has
increased by more than the predetermined value
over a prior row in the same area as where there
has been a prior adjacent horizontal cleave in the
image. In addition, by detecting where a prior horizontal
cleave has been made, that is, whether the
cleave has been made near the center of a character
or near the edge of the character, the second
cleaving operation need not be done vertically, but
rather can be done diagonally so as to preserve
the shape of certain characters such as a "j" or a
"g".

More particularly, as shown at step S1801, the
maximum character width "MAX" of characters in
the document image is first estimated. An accurate
estimation of the maximum character width is not
needed for proper operation of the underline removal
technique shown in Figure 18, and only a
rough estimate of the maximum character width is
needed. Accordingly, the maximum character width
may be set to an arbitrary fixed value, such as
MAX = 50 pixels, or it can be set to approximately
three times an estimated average character width.
In this embodiment, an estimated average character
width is calculated as approximately image
resolution divided by 16 and the maximum character
width MAX is set to three times that value.
Thus, for an image resolution of 400 dpi, MAX = 3
× 400/16 = 75 pixels.
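
The MAX estimate above is simple arithmetic and can be written directly. The function name `estimate_max_char_width` is an assumption made for this example; the divisor of 16 and the factor of three come from the text.

```python
def estimate_max_char_width(resolution_dpi):
    """Rough maximum character width in pixels for a given scan resolution."""
    average_width = resolution_dpi / 16  # estimated average character width
    return 3 * average_width             # MAX is three times the average
```

For a 400 dpi image, `estimate_max_char_width(400)` gives 75 pixels, matching the figure stated above.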

In step S1802, the document image is traversed
row-by-row from top to bottom. Then, in
step S1803, run lengths of horizontal pixel runs are
calculated. More particularly, as shown in Figure
19(a), an arbitrary document image 101 is comprised
of pixels making the character string
"Qqpygj". For an arbitrary row 102 of pixels in the
image, the horizontal run length of each horizontal
run of pixels is calculated. Thus, as shown at 104,
the horizontal run length of pixels which comprise
the left-most edge of the character "q" is calculated.
Similar run lengths are calculated for each
horizontal run of pixels in row 102.
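
A run-length calculation of the kind performed in step S1803 might look like the sketch below. The row format (a list of 0 and 1 values, 1 being black) and the name `horizontal_runs` are assumptions made for this example.

```python
def horizontal_runs(row):
    """Return (start, length) for each run of black pixels (value 1) in a row."""
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel == 1 and start is None:
            start = x                        # a run begins
        elif pixel != 1 and start is not None:
            runs.append((start, x - start))  # a run ends just before x
            start = None
    if start is not None:                    # run reaching the right edge
        runs.append((start, len(row) - start))
    return runs
```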

In step S1804, horizontal run lengths in the
current row are compared with horizontal run
lengths in the previous row. If the horizontal run
length in the current row does not increase by
more than MAX over the horizontal run lengths in
the previous row, then no special processing steps
are taken and the next row of the document image
is selected for processing (step S1805) and processing
continues (step S1806) until all rows have
been traversed from top to bottom. On the other
hand, if the calculation in step S1804 indicates that
the run lengths in the current row have increased
by more than MAX compared to run lengths in the
prior row, then the image is cleaved horizontally at
that row. Figure 19(b) illustrates this process.

More particularly, as shown in Figure 19(b),
processing has proceeded to a row at which it is
determined that the presence of underline 103
makes the horizontal run length for the current row
increase by more than MAX over the horizontal run
lengths of the previous row. Accordingly, all pixels
in that row are cleaved horizontally as shown at
105. Processing then continues with the next and
subsequent rows (steps S1805 and S1806) until all
rows have been traversed from top to bottom.
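
The top-to-bottom pass can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure of Figure 18: it compares only the longest run in each row with the longest run in the row above, and it reports the rows at which a horizontal cleave would be made rather than modifying the image. The names `longest_run` and `find_horizontal_cleaves` are hypothetical.

```python
def longest_run(row):
    """Length of the longest run of black pixels (value 1) in a row."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel == 1 else 0
        best = max(best, current)
    return best


def find_horizontal_cleaves(image, max_width):
    """Return indices of rows whose longest run grew by more than max_width."""
    cleaves = []
    previous = 0
    for y, row in enumerate(image):
        current = longest_run(row)
        # A jump larger than MAX suggests an underline starting at this row.
        if y > 0 and current - previous > max_width:
            cleaves.append(y)
        previous = current
    return cleaves
```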

Flow then advances to step S1808 in which the
cleaved image is traversed row-by-row from bottom
to top. In step S1809, run lengths of horizontal
pixel runs in the current row are calculated, and in
step S1810, the run lengths in the current row are
compared with the run lengths in the previous row.
As before, if the run length of the current row does
not increase by more than MAX over the run length in the
previous row, then no special processing takes
place and the next row is selected for processing
until all rows in the cleaved image have been
traversed from bottom to top.

On the other hand, if in step S1810 it is determined
that the run length of the current row has
increased by more than MAX over the run length of
the previous row, then flow advances to step S1813
which determines whether there has been a prior
horizontal cleave (from step S1807) in an adjacent
area. If step S1813 determines that there has not
been a prior horizontal cleave, then as before, no
special processing is carried out and flow returns
to step S1811 until all rows in the image have been
traversed from bottom to top.

On the other hand, if there has been a prior
horizontal cleave in an adjacent area, then the
horizontal cleave is reconnected (or closed) and
replaced with a pair of vertical or diagonal cleaves
as shown in steps S1814 through S1819. More
particularly, if in step S1814 it is determined that
there has been a small-sized horizontal cleave near
the center of a character, such as characters "q",
"p" and "y" in Figure 19(c), then flow advances to
step S1815 in which the horizontal cleave is reconnected
and a pair of vertical cleaves are inserted.
As specifically shown in Figure 19(c), since a prior
horizontal cleave has occurred near the center of
characters "q", "p" and "y", the horizontal cleave
is closed and replaced with a pair of vertical
cleaves as illustrated at 106.

In step S1816, if there has been a small horizontal
cleave near a character edge, then flow
advances to step S1817 in which the horizontal
cleave is reconnected and replaced with a pair of
diagonal cleaves. More particularly, as shown in
Figure 19(d), because horizontal cleaves have been
detected near the character edge for characters
"g" and "j", the horizontal cleave is closed and
replaced by pairs of diagonal cleaves 108.

In step S1818, if it is determined that there has
been a large horizontal cleave, then flow advances
to step S1819 in which the horizontal cleave is
reconnected and pairs of diagonal cleaves inserted
at wider spacing than were inserted in step S1817.
As specifically shown in Figure 19(e), since a large
horizontal cleave was detected for character "Q",
the horizontal cleave is closed and replaced by a
pair of diagonal cleaves 109.

[2.6.2 - Connected Component Analysis]

Figure 20 is a flow diagram illustrating a pre- 
ferred technique for obtaining connected compo- 
nents (step S1403). Particularly, the connected 
component analysis technique described above in 
section 2.6 was expensive in terms of CPU pro- 
cessing times and memory storage requirements 
because it was necessary for a CPU to compare 
individual pixel bits in image data many times and 
it was also necessary for the entire image to be 
present in memory at the same time. The tech- 
nique described here in Figure 20 only requires 
two rows of the image to be present in memory at 
any one time, and also does not require the CPU to 
access individual pixel bits and image data many 
times but rather allows the CPU to access pixel 
data only once, to obtain horizontal pixel segments.
Thereafter, the CPU works simply with the location 
of the horizontal pixel segments. 

Briefly, according to the technique described in
connection with Figure 20, a method for obtaining
connected components in pixel image data includes
opening a list of connected components
which initially contains no connected components,
traversing the image row by row, preferably from
bottom to top so as to output connected components
in proper sequence, identifying all horizontal
pixel segments in a current row of the image data,
and comparing horizontal segments in the current
row to horizontal segments in a previous row to
determine whether any or all of four different cases
exist: a first case in which the current row's segment
is adjacent an open area in the previous row,
a second case in which the current row's horizontal
segment is adjacent a horizontal segment in a
previous row, a third case in which the current
row's segment bridges at least two connected
components in the list of connected components,
and a fourth case in which the previous row's
horizontal segment is adjacent an open area in the
current row. If the first case exists, then a new
connected component is started in the list of connected
components. If the second case exists, then
the trace of the existing connected component of
the horizontal segment is updated. If the third case
exists, then the two connected components bridged
by the horizontal segment are merged. Finally, if
the fourth case exists, then the trace of the connected
component in the list of connected components
is closed out. After all rows in the image
have been traversed, the list of connected components
is output for further processing.

In more detail, as shown in step S2001, a
computerized list of connected components is
opened. The list is initialized to contain no connected
components, but ultimately the list will contain
all connected components in the image.

In step S2002, the image is traversed row by
row, preferably from the bottom of the image to the
top. This ordering is preferred so that the connected
components in the list of connected components
are ordered in proper sequential order.

In step S2003, all horizontal pixel segments in
the current row of the image are identified. More
particularly, as shown in Figure 21 for an arbitrary
image 120 of the word "UNION", for row 121 there
are no horizontal pixel segments. On the other
hand, for row 122, there are eight horizontal pixel
segments identified at areas 122a, b, c, d, e, f, g
and h. Each of those eight horizontal pixel segments
is identified in step S2003.

Flow then advances to step S2004 which determines
whether the horizontal pixel segments identified
in step S2003 are adjacent to horizontal segments
in the previous row of the image. If the
current row's horizontal segment is not adjacent to
a horizontal segment in the previous row, then a
new horizontal segment has been identified and
flow advances to step S2005 in which a new connected
component is started in the list of connected
components. Thus, for example, a new connected
component is started for each of the eight
horizontal segments 122a, b, c, d, e, f, g and h in
Figure 21.

On the other hand, if step S2004 determines
that the current row's horizontal segment is adjacent
to a horizontal segment in a previous row,
then in step S2006 the trace for the existing connected
component which corresponds to the horizontal
segment is simply updated. More particularly,
referring again to Figure 21, for row 123, each
of horizontal segments 123a through 123l is adjacent
to a horizontal segment in a previous row.
Accordingly, the trace for the connected component
corresponding to those horizontal segments is
simply updated. In this regard, it should be noted
that horizontal segments 123c and 123d are both
contained in the same connected component since
both horizontal line segments started with a single
line segment, namely horizontal pixel segment
122c. Likewise, horizontal pixel segments 123h and
123i both started from the horizontal pixel segment
(122f) and they also are both contained in the
same connected component.

Step S2007 determines whether a horizontal
pixel segment bridges two or more connected
components in the list of connected components. If
the horizontal pixel segment bridges two or more
connected components, then the traces for those
connected components are merged (step S2008).
More particularly, as shown for row 124 in Figure
21, horizontal pixel segment 124a bridges the two
connected components that were started for horizontal
segments 122a and 122b. Accordingly,
those two connected components are merged.
Similarly, horizontal segment 124c bridges the connected
components that were started for horizontal
segments 122c and 122d. Accordingly, those two
connected components are merged. It should be
noted that horizontal pixel segment 124e does not
bridge two different connected components since
only a single connected component was started at
122f.

Step S2009 determines whether a horizontal
pixel segment in a previous row is adjacent to an
open segment in the current row. If the previous
row's horizontal segment is now adjacent to an
open area, then the connected component has
been completed and the corresponding connected
component is closed out (step S2010).

In any event, flow then advances to step S2011
in which the next row in the image is processed
until all rows in the image have been completed
(step S2012). Once the entire image has been
processed, the list of connected components is
closed out and the list is output (step S2013) for
calculation of connected component attributes (see
step S1404).
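
The four cases of Figure 20 can be illustrated with the following Python sketch, which scans from bottom to top and keeps only the previous row's segments between rows. It is a minimal sketch under stated assumptions: the union-find bookkeeping used for merging, the bounding-box "trace", the eight-connectivity test between segments, and the names `row_segments` and `components_by_rows` are all choices made for this example and are not taken from the figure.

```python
def row_segments(row):
    """Return (start, end) column pairs, end exclusive, for each run of black pixels."""
    segments, start = [], None
    for x, pixel in enumerate(row):
        if pixel == 1 and start is None:
            start = x
        elif pixel != 1 and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(row)))
    return segments


def components_by_rows(image):
    """Return bounding boxes (top, left, bottom, right), scanning bottom to top."""
    parent = {}    # union-find forest over component ids
    boxes = {}     # root id -> [top, left, bottom, right] of an open component
    finished = []  # closed-out components, in order of completion
    previous = []  # (start, end, component id) for the previously scanned row
    next_id = 0

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def grow(box, top, left, bottom, right):
        return [min(box[0], top), min(box[1], left),
                max(box[2], bottom), max(box[3], right)]

    for y in range(len(image) - 1, -1, -1):
        current = []
        for start, end in row_segments(image[y]):
            # Segments touch if they overlap or meet diagonally.
            touching = sorted({find(cid) for s, e, cid in previous
                               if s <= end and start <= e})
            if not touching:
                # Case 1: adjacent to open area, so start a new component.
                cid = next_id
                next_id += 1
                parent[cid] = cid
                boxes[cid] = [y, start, y, end - 1]
            else:
                # Case 2: update the existing trace.
                cid = touching[0]
                for other in touching[1:]:
                    # Case 3: the segment bridges components, so merge them.
                    parent[other] = cid
                    boxes[cid] = grow(boxes[cid], *boxes.pop(other))
                boxes[cid] = grow(boxes[cid], y, start, y, end - 1)
            current.append((start, end, cid))
        # Case 4: components with no segment in this row are closed out.
        live = {find(cid) for _, _, cid in current}
        for _, _, cid in previous:
            root = find(cid)
            if root not in live and root in boxes:
                finished.append(tuple(boxes.pop(root)))
        previous = current
    finished.extend(tuple(box) for box in boxes.values())
    return finished
```

Because each row is reduced to its segments once, later comparisons work on segment coordinates rather than on individual pixels, which is the saving the text describes.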

[2.6.3 - Rules For Distinguishing Text From Non- 
Text] 

Figure 22 is a flow diagram illustrating the
plural sets of rules that are applied to connected
component attributes to determine whether the
connected component is a text or a non-text element.
The rules are scale-invariant, meaning that
they do not depend for proper operation upon
pre-knowledge of font size or other size information of
the document being analyzed.



The rules are arranged so that those rules
which are quick and which make easy-to-distinguish
determinations between text and non-text
connected components are applied first, while
those that are more difficult and make hard-to-distinguish
determinations between text and non-text
connected components are applied last. Because
the rules are only applied to "unknown" type
of connected components, however, the latter rules
are applied only infrequently since text and non-text
determinations will already have been made by
the earlier-applied rules.
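
The cheap-rules-first ordering can be pictured as a cascade in which each rule either decides the type or leaves it "unknown" for the next rule. In the sketch below, the two example rules and all numeric thresholds are invented purely for illustration; they are not the rules or values of Figure 22. The default return value stands in for a final catch-all rule that marks leftovers as non-text.

```python
def classify(attrs, rules, default="non-text"):
    """Apply rules in order until one returns "text" or "non-text"."""
    for rule in rules:
        verdict = rule(attrs)
        if verdict is not None:
            return verdict  # later, costlier rules are never evaluated
    return default          # nothing matched: treated as non-text


# Hypothetical rules with made-up thresholds, ordered cheapest first.
def long_horizontal_line(attrs):
    if attrs["aspect_ratio"] < 0.05 and attrs["density"] > 0.9:
        return "non-text"
    return None


def thin_character(attrs):
    if attrs["aspect_ratio"] > 3 and attrs["density"] > 0.5:
        return "text"
    return None


RULES = [long_horizontal_line, thin_character]
```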

In step S2201, the average height of connected
components is determined so as to permit calculation
of scale-invariant parameters for comparison to
the connected component attributes. Then, in step
S2202, the parameters are calculated based on the
average connected component height. Some parameters
are inherently scale-invariant and need
not be calculated based on average connected
component height. For example, since aspect ratio
is the ratio of height to width, it already is scale
invariant. Other parameters, however, such as minimum
height, are not scale invariant and therefore
are determined in step S2202.
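
One way to make a size threshold scale-invariant is to express it as a multiple of the average component height, as in the sketch below. The factors 0.25 and 4.0 and the parameter names are assumptions chosen for illustration; the text does not state the actual values.

```python
def scale_parameters(heights, min_height_factor=0.25, tall_factor=4.0):
    """Derive size thresholds from the average connected component height."""
    average = sum(heights) / len(heights)
    return {
        "average_height": average,
        "min_height": min_height_factor * average,  # hypothetical factor
        "tall_height": tall_factor * average,       # hypothetical factor
    }
```

Scanning the same page at twice the resolution doubles every height and therefore doubles each threshold, so the comparisons give the same outcome.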

Plural sets of rules are then applied to each
connected component whose type remains "unknown",
as detailed in the remainder of Figure 22.
Thus, in accordance with rule number 1, the height,
aspect ratio, density, (perimeter)²/area ratio, and
perimeter/width ratio are all inspected to determine
if the connected component has the approximate
height, aspect ratio, density, and perimeter of a
text connected component. If it does, then additional
tests are made on the height, aspect ratio,
and density of the connected component to determine
whether it is text or non-text, and the type of
the connected component is classified accordingly.

If rule number 1 did not apply and the connected
component remains "unknown", then in rule
number 2 the number of pixels, the perimeter, the
aspect ratio, and the height are inspected to determine
if the connected component is small and
thinner than a ".". If it is, then the connected
component is set to "non-text".

If rule number 2 did not apply and the connected
component remains "unknown", then in rule
number 3, the height, aspect ratio, and density of
the connected component are inspected to determine
if the connected component is a slash ("/"). If
it is, then the connected component is set to
"text".

If rule number 3 did not apply and the connected
component remains "unknown", then in rule
number 4, the aspect ratio, height, and density of
the connected component are examined to determine
if the connected component is a single small,
thin character like a "1", "I", etc. If it is, then the
connected component is set to "text".

If rule number 4 did not apply and the connected
component remains "unknown", then in rule
number 5, the aspect ratio, height, density, and
(perimeter)²/area ratio of the connected component
are examined to determine if the connected component
is a single small, short character like "-",
each part of "=" or "%". If it is, then
the connected component is set to "text".

If rule number 5 did not apply and the connected
component remains "unknown", then in rule
number 6, the aspect ratio, height, and density are
examined to determine if the connected component
is a small character like "." or each part of
":". If it is, the connected component type is
set to "text".

If rule number 6 did not apply and the con- 
nected component remains "unknown", then in rule 
number 7, the aspect ratio, height, and density of 
the connected component are examined to deter- 
mine if the connected component is a single char- 
acter with a small height and density like ">", "<", 
"u", or a "v". If it is, then the connected 
component type is set to "text". 

If rule number 7 did not apply and the connected
component remains "unknown", then in rule
number 8, the height, aspect ratio, density,
(perimeter)²/area ratio and perimeter/width ratio of the
connected component are examined to determine
whether the connected component is wide and
short like several connected characters in a row. If
it is, then if the perimeter/width ratio is low or the
density is high, like a line, then the type of connected
component is set to "non-text"; if the
perimeter/width ratio is high and the density is low, then
the connected component is set to "text".
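
The two-way decision inside rule number 8 can be sketched as follows. The "wide and short" test is reduced here to a single aspect-ratio check, and every threshold value is a placeholder invented for this example; only the line-versus-text logic (low perimeter/width ratio or high density means non-text, otherwise text) follows the description above.

```python
def rule_8(attrs, wide_aspect=0.2, low_ppw=3.0, high_density=0.8):
    """Return "text", "non-text", or None if the rule does not apply."""
    # Assumed applicability test: height/width below wide_aspect means wide and short.
    if attrs["aspect_ratio"] >= wide_aspect:
        return None
    # Line-like: little perimeter per unit width, or almost solid black.
    if attrs["perimeter_per_width"] < low_ppw or attrs["density"] > high_density:
        return "non-text"
    # Otherwise: high perimeter/width ratio and low density, like joined characters.
    return "text"
```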

If rule number 8 did not apply and the con- 
nected component remains "unknown", then in rule 
number 9, the aspect ratio and density of the 
connected component are examined to determine if 
the connected component is a tall or vertical line 
stroke like "|". If it is, then the connected compo- 
nent type is set to "non-text". 

If rule number 9 did not apply and the con- 
nected component remains "unknown", then in rule 
number 10, the aspect ratio and density of the 
connected component are examined to determine if 
the connected component is a long horizontal line 
stroke. If it is, then the connected component type 
is set to "non-text". 

If rule number 10 did not apply and the connected
component remains "unknown", then in rule
number 11, the height of the connected component
is examined to determine whether the connected
component is a tall non-text region that was not
picked up in rule number 9. If it is, then the
connected component type is set to "non-text".



If rule number 11 did not apply and the connected
component remains "unknown", then in rule number
12, the height and density of the connected
component are examined to determine if the connected
component is a borderline text component
not already picked up. If it is, then the connected
component type is set to "text".

If rule number 12 did not apply and the connected
component remains "unknown", then in rule
number 13 the aspect ratio, height, density,
(perimeter)²/area ratio and perimeter/width ratio of the
connected component are examined to determine if
the connected component is a series of connected
letters for short words such as "an", "the", "was",
and the like which were not already picked up by
rule number 8. If it is, then the connected component
is set to "text".

If rule number 13 did not apply and the connected
component remains "unknown", then in rule
number 14 the aspect ratio and density of the
connected component are examined to determine if
the connected component is a non-text blotch. If it
is, then the connected component is set to
"non-text".

If rule number 14 did not apply and the connected
component remains "unknown", then in rule
number 15 the density of the connected component
is examined to determine if the connected
component is a non-text blotch with very high
density, for example, detailed graphics, or a non-text
blotch with very low density, for example,
frames surrounding text such as are found in tables.
If it is, the connected component is set to
"non-text".

If rule number 15 did not apply and the connected
component remains "unknown", then in rule
number 16 the height, density, aspect ratio,
(perimeter)²/area ratio and perimeter/width ratio of the
connected component are examined to determine if
the connected component is a larger-font word
typically found in titles and headings. If it is, then
the connected component is set to "text".

If rule number 16 did not apply and the connected
component remains "unknown", then in rule
number 17 the height, density, aspect ratio,
(perimeter)²/area ratio and perimeter/width ratio of the
connected component are examined to determine if
the connected component is a non-text element
that is similar to a larger-font word but which has a
lower perimeter and is therefore non-text. If it is,
then the connected component is set to "non-text".

If rule number 17 did not apply and the connected
component remains "unknown", then in rule
number 18 the height and density of the connected
component are examined to determine if the connected
component is a borderline text block which
is not picked up in rule number 12. If it is, then the
connected component is set to "text".






If rule number 18 did not apply and the con- 
nected component remains "unknown", then in rule 
number 19 the (perimeter)²/area ratio, perimeter/width
ratio and the density of the connected
component are examined to determine if the con- 
nected component is a remaining difficult-to-deter- 
mine text connected component. If it is, then the 
connected component is set to "text". 

If rule number 19 did not apply and the connected
component remains "unknown", then in rule
number 20 the (perimeter)²/area ratio, perimeter/width
ratio and density of the connected component are
examined to determine if the connected component
is a difficult-to-determine non-text element not
picked up in rule number 18. If it is, then the connected
component is set to "non-text".

If rule number 20 did not apply and the connected
component remains "unknown", then in rule
number 21 the density, aspect ratio and
(perimeter)²/area ratio of the connected component are
examined to find remaining difficult-to-determine
text-type connected components not picked up by
rule number 19. If the connected component is one
of the remaining difficult-to-determine text-type
connected components, then the connected component
is set to "text".

If rule number 21 did not apply and the connected
component remains "unknown", then in rule number 22
the height, perimeter/width ratio, aspect ratio and
(perimeter)²/area ratio of the connected component
are all examined to determine if the connected
component is an isolated larger-font character such
as an initial large-font letter in a magazine
article. If it is, then the connected component is
set to "text".

If rule number 22 did not apply and the con- 
nected component remains "unknown", then in rule 
number 23 the height, perimeter/width ratio and 
aspect ratio of the connected component are ex- 
amined to determine if the connected component is 
an isolated non-text element similar to larger-font 
characters like the font in a heading or a title but 
nevertheless non-text. If it is, then the connected 
component is set to "non-text". 

If rule number 23 did not apply and the connected
component remains "unknown", then in rule number 24
the (perimeter)²/area ratio and perimeter/width
ratio of the connected component are examined to
determine if the connected component is a very long
word or set of connected words. At this point in the
filtering rules, this filter very rarely finds
anything but is nevertheless included to ensure that
such series of connected words are properly
designated as "text". If the criteria of the rule
are met, then the connected component is set to
"text".

If rule number 24 did not apply and the connected
component remains "unknown", then in rule number 25
remaining connected components are set to
"non-text".

In rule number 26, each text connected component is
examined and, if the text connected component is
isolated from other text connected components, then
the connected component is set to "non-text". This
ensures that isolated markings on the page, such as
may be created by stray pencil marks or water marks,
are not erroneously interpreted as text.
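
The control flow of the filter rules described above can be sketched as an ordered cascade. This is a minimal illustration, not the patented rule set: the `Component` record, the `gap` measure, the `max_gap` parameter and the sample rule `tall_is_text` are all hypothetical, and the numeric thresholds are invented for the example. The sketch shows only that a rule is consulted while the type is still "unknown", that rule 25 defaults the remainder to "non-text", and that rule 26 demotes isolated text components.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    """Bounding box of a connected component plus its current type."""
    x: int
    y: int
    width: int
    height: int
    type: str = "unknown"


# A rule returns "text" or "non-text" when it applies, or None when it does not.
Rule = Callable[[Component], Optional[str]]


def gap(a: Component, b: Component) -> int:
    """Largest axis-wise gap between two boxes (0 if they touch or overlap)."""
    dx = max(a.x - (b.x + b.width), b.x - (a.x + a.width), 0)
    dy = max(a.y - (b.y + b.height), b.y - (a.y + a.height), 0)
    return max(dx, dy)


def classify(components: List[Component], rules: List[Rule], max_gap: int) -> None:
    """Apply the rules in order; the first rule that applies decides the type."""
    for cc in components:
        for rule in rules:
            if cc.type != "unknown":
                break
            verdict = rule(cc)
            if verdict is not None:
                cc.type = verdict
        if cc.type == "unknown":
            cc.type = "non-text"  # rule 25: whatever is still unknown is non-text

    # Rule 26: demote text components with no other text component nearby.
    # Isolation is judged against the text set as it stood before any demotion
    # (an assumption; the source does not say).
    text = [cc for cc in components if cc.type == "text"]
    isolated = [cc for cc in text
                if all(gap(cc, other) > max_gap for other in text if other is not cc)]
    for cc in isolated:
        cc.type = "non-text"


def tall_is_text(cc: Component) -> Optional[str]:
    """Hypothetical stand-in rule: anything at least 10 pixels tall is text."""
    return "text" if cc.height >= 10 else None


page = [Component(0, 0, 8, 12), Component(10, 0, 8, 12),
        Component(500, 400, 8, 12), Component(30, 0, 2, 2)]
classify(page, [tall_is_text], max_gap=20)
print([cc.type for cc in page])
```

Passing the rules as a list keeps the ordering explicit, which matters here because each rule only sees components that every earlier rule left "unknown".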

Claims 

1. A personal imaging computer system comprising:

imaging computer means connectable to a local area
network (LAN), said imaging computer means for
performing selectable image-processing tasks on
document images; and

a plurality of programmable function keys on said
imaging computer means, each of said function keys
being programmable to perform chainable ones of the
image-processing tasks of said imaging computer
means, and each of said function keys being
manipulable by an operator so as to cause said
imaging computer means to chainably perform the
pre-programmed image-processing tasks.

2. A system according to Claim 1, wherein said
plurality of programmable function keys are
partitioned into at least two groups, wherein the
first group is restricted to programming only by a
network administrator for the LAN and wherein the
second group is programmable by any LAN user.

3. A system according to Claim 1, wherein said
image-processing tasks include tasks to retrieve
document images from the LAN, to recognition-process
character images in the document images, and to
store text files to the LAN.

4. A system according to Claim 1, further comprising
display means for displaying an image of said plural
function keys, wherein in response to operator
selection of an image of one of said plural function
keys, said display means displays the function
performed by that key.

5. A system according to Claim 4, wherein, with the
image of each of said plural function keys, said
display means displays an identification of that
key.
6. A system according to Claim 5, wherein said
identification comprises a user identification.

7. Image processing method for a personal imaging
computer system connected to a local area network
(LAN) comprising the steps of:

programming one of plural programmable function keys
on said personal imaging computer system, the
programmable function key being programmed to cause
the personal imaging computer system to perform
chainable ones of plural selectable image-processing
tasks; and

in response to manipulation of a function key,
chainably performing the image processing task
programmed by that key.

8. A method according to Claim 7, wherein the
programmable function keys are divided into at least
two groups, and wherein programmable function keys
in the first group are restricted to programming by
a network administrator for said LAN whereas
programmable function keys in the second group are
unrestricted and can be programmed by any network
user.

9. A method according to Claim 8, wherein said
image-processing tasks include tasks to retrieve
document images from the LAN, to recognition-process
character images in the document images, and to
store text files to the LAN.

10. A method according to Claim 7, further
comprising the steps of:

displaying an image of said plural programmable
function keys on said personal imaging computer
system; and

in response to selection of an image of one of said
displayed function keys, displaying functions
performed by that key.

11. A method according to Claim 10, wherein in said
displaying step, with the image of each of said
plural function keys, an identification of that key
is displayed.

12. A method according to Claim 11, wherein said
identification comprises a user identification.

13. A computerised image processing system
comprising image processing means (20) and a
plurality of work stations (40) connected in a
network, the image processing means having one or
more programmable function keys, characterised in
that

the function of a programmable function key of the
image processing means can be programmed from a
workstation, preferably a workstation physically
remote from the image processing means.
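
As an informal illustration of the arrangement the claims recite (keys programmed with chainable tasks, one group restricted to the network administrator, and a display of each key's function and user identification), the following sketch may help. Every name in it is hypothetical: `FunctionKeys`, its methods, the key labels and the three stand-in tasks are invented for the example and do not come from the specification.

```python
from typing import Callable, Dict, List

# A task takes the output of the previous task and returns its own output.
Task = Callable[[object], object]


class FunctionKeys:
    """Programmable keys split into an administrator-only group and an open group."""

    def __init__(self, restricted: List[str], unrestricted: List[str]) -> None:
        self.restricted = set(restricted)
        self.unrestricted = set(unrestricted)
        self.programs: Dict[str, List[Task]] = {}
        self.owners: Dict[str, str] = {}

    def program(self, key: str, tasks: List[Task], user: str,
                is_admin: bool = False) -> None:
        """Store a task chain for a key, enforcing the group restriction."""
        if key in self.restricted:
            if not is_admin:
                raise PermissionError(f"{key} can only be programmed by the administrator")
        elif key not in self.unrestricted:
            raise KeyError(key)
        self.programs[key] = list(tasks)
        self.owners[key] = user

    def describe(self, key: str) -> str:
        """Text a display could show for a key: user identification and task chain."""
        names = " -> ".join(task.__name__ for task in self.programs[key])
        return f"{key} [{self.owners[key]}]: {names}"

    def press(self, key: str, data: object = None) -> object:
        """Run the programmed tasks in order, feeding each result to the next."""
        for task in self.programs[key]:
            data = task(data)
        return data


# Stand-in tasks; a real system would talk to the LAN and an OCR engine.
def retrieve_image(name):
    return f"image({name})"


def recognize_text(image):
    return f"text-of-{image}"


def store_text(text):
    return f"stored:{text}"


keys = FunctionKeys(restricted=["F1"], unrestricted=["F2"])
keys.program("F2", [retrieve_image, recognize_text, store_text], user="alice")
print(keys.describe("F2"))
print(keys.press("F2", "memo"))
```

Storing the chain as a plain list of callables is one simple way to model "chainable" tasks; the source does not say how the chain is represented.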
FIG. 1(b)



FIG. 1(c)




FIG. 2 






S801: SCAN DOCUMENT AND STORE IMAGE AT HIGH RESOLUTION
S802: OCR-PROCESS TO CREATE TEXT FILE FOR TEXT AREAS OF DOCUMENT
S803: REDUCE RESOLUTION OF DOCUMENT IMAGE
S804: STORE LOWERED RESOLUTION DOCUMENT IMAGE IN ASSOCIATION WITH TEXT FILE
S805: RETRIEVE DOCUMENT IMAGE IN RESPONSE TO QUERY-BASED SEARCH OF TEXT FILE

FIG. 8
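
The flow of FIG. 8 can be sketched in a few lines. This is only an illustration under stated assumptions: images are plain lists of pixel rows, resolution is reduced by simple subsampling, the OCR engine is a caller-supplied stand-in function, and the query is a case-insensitive substring match. The names `reduce_resolution` and `DocumentStore` are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]


def reduce_resolution(image: Image, factor: int) -> Image:
    """Keep every factor-th pixel in each direction (plain subsampling)."""
    return [row[::factor] for row in image[::factor]]


class DocumentStore:
    """Keeps a lowered-resolution image together with the text OCR produced."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Image]] = []

    def add(self, image: Image, ocr: Callable[[Image], str], factor: int = 4) -> None:
        text = ocr(image)                         # S802: OCR on the high-resolution scan
        small = reduce_resolution(image, factor)  # S803: reduce resolution
        self.records.append((text, small))        # S804: store image with its text

    def search(self, query: str) -> List[Image]:
        """S805: return the stored images whose text contains the query."""
        q = query.lower()
        return [image for text, image in self.records if q in text.lower()]


store = DocumentStore()
scan = [[(x + y) % 2 for x in range(8)] for y in range(8)]
store.add(scan, ocr=lambda image: "Quarterly report", factor=4)
print(len(store.search("report")))
```

Running OCR before the resolution is lowered mirrors the order of steps S802 and S803 in the figure.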



S901: INPUT GRAY-SCALE IMAGE
S902: DE-SKEW IMAGE
S903: PRESERVE COPY OF GRAY-SCALE IMAGE
S904: DERIVE BINARY IMAGE BY GLOBAL THRESHOLDING
S905: SEGMENTATION-PROCESS BINARY IMAGE TO LOCATE CHARACTER IMAGE
S906: OBTAIN CHARACTER TEMPLATES FROM SHAPE OF BINARY CHARACTER IMAGES
S907: EXTRACT CHARACTERS FROM GRAY-SCALE IMAGE USING TEMPLATES
S908: RECOGNITION-PROCESS EXTRACTED GRAY-SCALE CHARACTER IMAGES
S915: PAGE RECONSTRUCTION
S916: OUTPUT TEXT FILE

FIG. 9-1



S901: INPUT GRAY-SCALE IMAGE
S902: DE-SKEW IMAGE
S903: PRESERVE COPY OF GRAY-SCALE IMAGE
S904: DERIVE BINARY IMAGE BY GLOBAL THRESHOLDING
S905: SEGMENTATION-PROCESS BINARY IMAGE TO LOCATE CHARACTER IMAGE
S906: OBTAIN CHARACTER TEMPLATES FROM SHAPE OF BINARY CHARACTER IMAGES
S907: EXTRACT CHARACTERS FROM GRAY-SCALE IMAGE USING TEMPLATES
S909: DETERMINE FONT CHARACTERISTICS OF CHARACTERS IN A LINE
S910: SELECT A RECOGNITION-PROCESSING TECHNIQUE TUNED TO THE FONT CHARACTERISTICS
S911: RECOGNITION-PROCESS EXTRACTED GRAY-SCALE CHARACTER IMAGES
S915: PAGE RECONSTRUCTION
S916: OUTPUT TEXT FILE

FIG. 9-2



S901: INPUT GRAY-SCALE IMAGE
S902: DE-SKEW IMAGE
S903: PRESERVE COPY OF GRAY-SCALE IMAGE
S904: DERIVE BINARY IMAGE BY GLOBAL THRESHOLDING
S905: SEGMENTATION-PROCESS BINARY IMAGE TO LOCATE CHARACTER IMAGE
S906: OBTAIN CHARACTER TEMPLATES FROM SHAPE OF BINARY CHARACTER IMAGES
S907: EXTRACT CHARACTERS FROM GRAY-SCALE IMAGE USING TEMPLATES
S908: RECOGNITION-PROCESS EXTRACTED GRAY-SCALE CHARACTER IMAGES
S913: RECOGNITION-PROCESS EXTRACTED BINARY CHARACTER IMAGES
S914: RESOLVE AMBIGUITIES
S915: PAGE RECONSTRUCTION
S916: OUTPUT TEXT FILE

FIG. 9-3



S1001: SUB-SAMPLE IMAGE
S1002: BINARIZE
S1003: ROUGH HOUGH TRANSFORM
S1004: FINE HOUGH TRANSFORM TO DETERMINE SKEW ANGLE
S1006: DE-SKEW BY MATHEMATICAL TRANSFORMATION
S1007: DETERMINE VERTICAL SHIFT FACTOR
S1008: SUCCESSIVELY SHIFT ALL COLUMNS OF IMAGE UP OR DOWN IN ACCORDANCE WITH SHIFT FACTOR
S1009: OUTPUT DE-SKEWED IMAGE

FIG. 10(a)



S1001: SUB-SAMPLE IMAGE
S1002: BINARIZE
S1003: ROUGH HOUGH TRANSFORM
S1004: FINE HOUGH TRANSFORM TO DETERMINE SKEW ANGLE
S1005: [decision; text not recoverable]
    YES: S1006: DE-SKEW BY MATHEMATICAL TRANSFORMATION
    NO: S1007: DETERMINE VERTICAL SHIFT FACTOR
S1010: [decision; text not recoverable]
S1011: SUCCESSIVELY SHIFT ALL COLUMNS OF IMAGE UP OR DOWN IN ACCORDANCE WITH SHIFT FACTOR
S1012: ACCUMULATE SHIFT FACTOR
S1013: OUTPUT DE-SKEWED IMAGE

FIG. 10(b)
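
The column-shifting branch of FIG. 10 can be illustrated with a short sketch. It assumes, without support from the extracted figure text, that the shift for column x is the rounded value of x multiplied by the tangent of the skew angle, that a positive shift moves a column up, and that vacated pixels are filled with a background value. The function name `deskew_by_column_shift` is hypothetical.

```python
import math
from typing import List


def deskew_by_column_shift(image: List[List[int]], angle_deg: float,
                           background: int = 0) -> List[List[int]]:
    """Shift each column vertically by an amount that grows with its x position."""
    height, width = len(image), len(image[0])
    slope = math.tan(math.radians(angle_deg))  # assumed shift factor per column
    out = [[background] * width for _ in range(height)]
    for x in range(width):
        shift = round(x * slope)               # positive shift moves the column up
        for y in range(height):
            src = y + shift
            if 0 <= src < height:
                out[y][x] = image[src][x]
    return out


# A diagonal stroke that drops one row per column is straightened into row 0.
skewed = [[1 if x == y else 0 for x in range(4)] for y in range(4)]
for row in deskew_by_column_shift(skewed, 45):
    print(row)
```

The 45 degree angle is chosen only to make the example easy to follow by eye; shifting whole columns avoids resampling and is a natural fit for small skew angles.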



FIG.11(a) 



FIG.11(b) 




FIG.11(c) 






S1201: FORM HISTOGRAM OF PIXEL INTENSITIES
SELECT TOP TWO GROUPS THAT ARE SEPARATED BY AT LEAST ONE GROUP
CALCULATE GLOBAL THRESHOLD AT HALF THE DISTANCE BETWEEN SELECTED GROUPS
COMPARE EACH PIXEL INTENSITY TO GLOBAL THRESHOLD
    HIGHER: SET PIXEL TO BLACK ("1")
    LOWER: SET PIXEL TO WHITE ("0")
S1210: OUTPUT BINARY IMAGE

[Step labels S1202, S1203 and S1205 to S1209 also appear in the figure; their assignment to the boxes is not recoverable.]

FIG. 12
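
The thresholding steps of FIG. 12 can be sketched as follows. Several details are assumptions rather than statements from the source: 256 intensity levels split into 8 equal groups (suggested by the group labels I to VIII in FIG. 13), "top two groups" read as the most populated group plus the next most populated group that is not its neighbour, and "half the distance" read as the midpoint between the two group centres. The function names are hypothetical.

```python
from typing import List


def global_threshold(image: List[List[int]], groups: int = 8, levels: int = 256) -> float:
    """Pick a threshold midway between the two dominant, non-adjacent histogram groups."""
    size = levels // groups
    counts = [0] * groups
    for row in image:
        for pixel in row:
            counts[min(pixel // size, groups - 1)] += 1

    # Groups ordered from most to least populated (ties keep the lower group first).
    order = sorted(range(groups), key=lambda g: counts[g], reverse=True)
    first = order[0]
    # The second group must be separated from the first by at least one group.
    second = next(g for g in order[1:] if abs(g - first) >= 2)

    def centre(group: int) -> float:
        return group * size + size / 2

    return (centre(first) + centre(second)) / 2


def binarize(image: List[List[int]], threshold: float) -> List[List[int]]:
    """Higher than the threshold becomes black (1), otherwise white (0), as in FIG. 12."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


gray = [[10, 10, 10, 200],
        [10, 10, 40, 200],
        [10, 40, 40, 200],
        [10, 10, 40, 10]]
t = global_threshold(gray)
print(t)
print(binarize(gray, t))
```

In the sample image the second most populated group is adjacent to the first, so it is skipped in favour of the group containing intensity 200.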



FIG. 13(a)

FIG. 13(b)

[Histogram groups I to VIII; annotation: TH = 160/2 = 80]



CONNECTED COMPONENT
    IMAGE
    LOCATION
    ATTRIBUTES:
        ASPECT RATIO
        PIXEL COUNT
        DENSITY
        PERIMETER
        PERIMETER/WIDTH RATIO
        (PERIMETER)²/AREA RATIO
        TYPE (I.E. "TEXT", "NON-TEXT" OR "UNKNOWN")

FIG. 16
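
The record of FIG. 16 maps naturally onto a small data structure. The sketch below is an assumption-laden illustration: the figure names the attributes but this section does not define them, so aspect ratio is taken here as width over height, area as the bounding-box area, and density as pixel count over that area. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConnectedComponent:
    """Location, measured values and type of one connected component."""
    x: int
    y: int
    width: int
    height: int
    pixel_count: int
    perimeter: int
    type: str = "unknown"  # "text", "non-text" or "unknown"

    @property
    def area(self) -> int:
        # Assumed definition: area of the bounding box.
        return self.width * self.height

    @property
    def aspect_ratio(self) -> float:
        # Assumed definition: width divided by height.
        return self.width / self.height

    @property
    def density(self) -> float:
        # Assumed definition: black pixels per unit of bounding-box area.
        return self.pixel_count / self.area

    @property
    def perimeter_width_ratio(self) -> float:
        return self.perimeter / self.width

    @property
    def perimeter_sq_area_ratio(self) -> float:
        return self.perimeter ** 2 / self.area


cc = ConnectedComponent(x=5, y=7, width=20, height=10, pixel_count=100, perimeter=60)
print(cc.aspect_ratio, cc.density, cc.perimeter_width_ratio, cc.perimeter_sq_area_ratio)
```

Deriving the ratios on demand keeps them consistent with the stored measurements; a real implementation might instead compute them once when the component is closed out.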



S1401: INPUT IMAGE
S1402: REMOVE UNDERLINES
S1403: OBTAIN CONNECTED COMPONENTS
S1404: DERIVE ATTRIBUTES FOR EACH CONNECTED COMPONENT
S1405: [decision] LANDSCAPE / PORTRAIT
S1406: ROTATE IMAGE
S1407: APPLY TEXT/NON-TEXT FILTER RULES
S1408: MAKE LIST OF TEXT LINES
S1409: [decision; text not recoverable]
S1410: SEPARATE TOUCHING LINES
S1411: CUT CHARACTERS FROM LINES
S1412: OUTPUT CHARACTERS

FIG. 14



S1801: ESTIMATE MAXIMUM CHARACTER WIDTH ("MAX")
S1802: TRAVERSE IMAGE FROM TOP TO BOTTOM
S1803: CALCULATE HORIZONTAL PIXEL RUN LENGTHS FOR CURRENT ROW
S1804: [decision] INCREASES BY MORE THAN MAX / INCREASES BY LESS THAN MAX
S1805: NEXT ROW
S1807: CLEAVE IMAGE HORIZONTALLY
S1808: TRAVERSE CLEAVED IMAGE FROM BOTTOM TO TOP
S1809: CALCULATE HORIZONTAL PIXEL RUN LENGTHS FOR CURRENT ROW
[decision] INCREASES BY LESS THAN MAX: NEXT ROW
S1820: OUTPUT CHARACTERS SEPARATED FROM THEIR UNDERLINES

[Step labels S1806, S1811 and S1813 also appear in the figure; their assignment to the boxes is not recoverable.]

FIG. 18(a)

FIG. 18(b)

FIG. 18



FIG. 19(a)

FIG. 19(b)

FIG. 19(c)

FIG. 19(d)

FIG. 19(e)

[Reference numerals shown: 101, 103, 104, 105, 106, 108, 109]



S2001: OPEN LIST OF CONNECTED COMPONENTS
S2002: TRAVERSE IMAGE ROW-BY-ROW
S2003: IDENTIFY ALL HORIZONTAL PIXEL SEGMENTS IN CURRENT ROW
S2004: IS CURRENT ROW'S SEGMENT ADJACENT SEGMENT IN PREVIOUS ROW?
    NO: S2005: START NEW CONNECTED COMPONENT
    YES: S2006: UPDATE TRACE OF EXISTING CONNECTED COMPONENT
S2007: DOES CURRENT ROW'S SEGMENT BRIDGE TWO CONNECTED COMPONENTS?
    YES: S2008: MERGE TRACES FOR THE BRIDGED CONNECTED COMPONENTS
S2009: IS PREVIOUS ROW'S SEGMENT ADJACENT CURRENTLY OPEN AREA?
    YES: S2010: CLOSE OUT CONNECTED COMPONENT
S2011: NEXT ROW
S2013: OUTPUT LIST OF CONNECTED COMPONENTS

FIG. 20
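
The row-by-row tracing of FIG. 20 can be sketched with horizontal runs of black pixels. This is a minimal illustration, not the patent's implementation: it assumes 8-connectivity (runs that touch diagonally count as adjacent), represents a component simply as the list of its runs, and uses hypothetical names throughout.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int, int]  # (row, start column, end column exclusive)


def row_runs(row: List[int], y: int) -> List[Run]:
    """Identify all horizontal segments of black pixels in one row."""
    runs, start = [], None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x
        elif not pixel and start is not None:
            runs.append((y, start, x))
            start = None
    if start is not None:
        runs.append((y, start, len(row)))
    return runs


def connected_components(image: List[List[int]]) -> List[List[Run]]:
    open_cc: Dict[int, List[Run]] = {}   # components still being traced
    closed: List[List[Run]] = []         # components that can no longer grow
    prev: List[Tuple[Run, int]] = []     # previous row's runs with component ids
    next_id = 0
    for y, row in enumerate(image):
        current: List[Tuple[Run, int]] = []
        for run in row_runs(row, y):
            _, s, e = run
            # Ids of previous-row runs that overlap or touch this run diagonally.
            touching = sorted({cid for (_, ps, pe), cid in prev if s <= pe and ps <= e})
            if not touching:
                cid = next_id            # start a new connected component
                next_id += 1
                open_cc[cid] = []
            else:
                cid = touching[0]        # extend an existing component
                for other in touching[1:]:
                    # The run bridges two components: merge their traces.
                    open_cc[cid].extend(open_cc.pop(other))
                    prev = [(r, cid if c == other else c) for r, c in prev]
                    current = [(r, cid if c == other else c) for r, c in current]
            open_cc[cid].append(run)
            current.append((run, cid))
        # A component with no run in this row has nothing below it: close it out.
        live = {cid for _, cid in current}
        for cid in [c for c in open_cc if c not in live]:
            closed.append(open_cc.pop(cid))
        prev = current
    closed.extend(open_cc.values())
    return closed


bitmap = [[1, 1, 0, 0, 1],
          [0, 1, 0, 0, 1],
          [0, 0, 0, 0, 0],
          [1, 0, 1, 0, 0],
          [1, 1, 1, 0, 0]]
components = connected_components(bitmap)
print(len(components))
```

In the sample bitmap the bottom run joins two components that started separately in the row above, which exercises the merge step.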



S2201: DETERMINE AVERAGE CONNECTED COMPONENT ("CC") HEIGHT
S2202: DETERMINE PARAMETERS BASED ON AVERAGE CC HEIGHT
RULE #1: IS CC APPROXIMATE HEIGHT, DENSITY AND ASPECT RATIO OF TEXT?
[The decision boxes for the intermediate rules are not recoverable. The surviving branch labels read YES: TYPE = "TEXT", YES: TYPE = "NON-TEXT" and NO: TYPE = "UNKNOWN".]
RULE #25: SET REMAINING CCS TO NON-TEXT
RULE #26: SET TEXT CCS TO NON-TEXT IF ISOLATED FROM OTHER TEXT CCS

FIG. 22(a)

FIG. 22(b)

FIG. 22(c)

FIG. 22(d)

FIG. 22(e)

FIG. 22(f)

FIG. 22



(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 0 677 816 A3

(12) EUROPEAN PATENT APPLICATION

(88) Date of publication A3: 22.05.1996 Bulletin 1996/21

(43) Date of publication A2: 18.10.1995 Bulletin 1995/42

(21) Application number: 95301159.0

(22) Date of filing: 22.02.1995

(51) Int. Cl.: G06K 9/00, G06T 1/00

(84) Designated Contracting States: DE FR GB IT

(30) Priority: 15.04.1994 US 228419

(71) Applicant: CANON KABUSHIKI KAISHA
     Tokyo (JP)

(72) Inventor: Melen, Roger D.,
     c/o Canon R. C. America, Inc.
     Palo Alto, CA 94304 (US)

(74) Representative: Beresford, Keith Denis Lewis et al
     BERESFORD & Co.
     2-5 Warwick Court
     High Holborn
     London WC1R 5DJ (GB)

(54) Programmable function keys for a networked personal imaging computer system



(57) A personal imaging computer system (PICS)
includes a plurality of programmable function keys
which can be programmed so as to cause the PICS
equipment to perform at least one of plural
selectable image-processing tasks. The PICS
equipment is connected to a computerized local area
network, and the programmable function keys can be
programmed by LAN users from their workstations.
Preferably, the programmable function keys are
divided into two groups, one of the groups being
restricted to programming by the network
administrator and the other group having
unrestricted programming. The PICS equipment
includes a display which displays an image of the
plural function keys, and in response to operator
selection of an image of one of those plural
function keys, the function performed by that key is
displayed. When the physical function key itself is
manipulated, the PICS equipment executes the
programmed imaging processing tasks.